ABSTRACT

A study assessed the role of orthographic structure 
in college students' perceptual recognition and judgment of letter 
strings. Lexical status, word frequency, bigram frequency, log bigram 
frequency, and regularity of letter sequencing were orthogonally 
varied across a series of experiments. Six-letter words and their
anagrams were used as test stimuli in a target-search task. Results 
showed that words were recognized better than their corresponding 
equally well-structured anagrams, but that word frequency had little 
effect. Orthographically regular anagrams were recognized better than 
irregular anagrams, whereas log bigram frequency did not have an
effect. In contrast, post hoc correlations revealed that log bigram
frequency did correlate significantly with individual item 
performance. In a final experiment, subjects judged which of a pair 
of letter strings most resembled English in terms of either the 
frequency or the regularity of letter sequences. Findings revealed an 
influence of essentially the same dimensions of orthographic 
structure as that revealed by the perceptual recognition task. The 
overall results provided evidence for lexical status, regularity of 
letter sequencing, and frequency of letter sequencing as important 
dimensions in the psychologically real description of orthographic 
structure. (Author/FL)

Abstract



The present research assessed the role of orthographic 
structure in the perceptual recognition and the judgment of 
letter strings. Lexical status, word frequency, bigram
frequency, log bigram frequency, and regularity of letter 
sequencing were orthogonally varied across a series of 
experiments. Six-letter words and their anagrams were used 
as test stimuli in a target-search task. Words were 
recognized better than their corresponding equally well- 
structured anagrams but word frequency had little effect. 
Orthographically regular anagrams were recognized better 
than irregular anagrams whereas log bigram frequency did not 
have an effect. In contrast, post hoc correlations revealed 
that log bigram frequency did correlate significantly with 
individual item performance. In a final experiment,
subjects judged which of a pair of letter strings most 
resembled English in terms of either the frequency or the 
regularity of letter sequences. The results revealed an 
influence of essentially the same dimensions of orthographic 
structure as was revealed by the perceptual recognition 
task. The results provide evidence for lexical status, 
regularity of letter sequencing, and frequency of letter
sequencing as important dimensions in the psychologically
real description of orthographic structure.



Orthographic structure and visual 
processing of letters and words. 

It is widely acknowledged that the reader contributes 
as much or more to reading than does the "information" on 
the printed page. One compelling issue in reading research 
is how the reader's higher-order knowledge of the language 
interacts with lower-level perceptual analyses during 
reading. The specific question addressed in the present 
paper is how the reader's knowledge about orthographic 
structure is combined with the information derived from 
visual featural analysis in letter and word recognition. 
Orthographic structure refers to the spelling constraints in 
a written language. Visual featural analysis refers to the 
evaluation of component properties of letters leading to 
letter and word recognition. Given the considerable amount 
of predictability in English writing, we ask how the reader 
utilizes this orthographic structure in word recognition. 

Evaluation of the contributions of visual features and 
orthographic structure to word recognition can be 
facilitated by a detailed description of the processes
involved in reading. The description we use is part of a
more general model of language processing (Massaro,
1975, 1973, 1979a; Massaro, Taylor, Venezky, Jastrzembski, &
Lucas, 1980). According to the model, reading can be viewed
as a sequence of processing stages. Figure 1 presents a 
schematic representation of the stages of processing; at 
each stage of processing, memory and process components are
represented. Each memory component (indicated by a
rectangle) corresponds to the information available at a
particular stage of processing. Each process component
(indicated by a circle) corresponds to the operations
applied to the information held by the memory component.
The memory components are temporary storages except for
long-term memory, which is relatively permanent. It is
assumed that long-term memory supplements the information at
some of the processing stages.

Figure 1. A stage model of reading printed text.

During reading, the light pattern reflected from a 
display of letters is transduced by the visual receptors as 
the feature detection process detects and transmits visual 
features to preperceptual visual storage (see Figure 1). As 
visual features enter preperceptual visual storage, the
primary recognition process attempts to transform these
isolated features into a sequence of letters and spaces in
synthesized visual memory. To do this, the primary
recognition process can utilize information held in long-
term memory. For the accomplished reader this includes a
list of features for each letter of the alphabet along with 
information about the orthographic structure of the 
language. Accordingly, the primary recognition process uses 
both the visual features in preperceptual storage and 
knowledge of orthographic structure in long-term memory 
during the synthesis of letter strings. 

The primary recognition process operates on a number of 
letters simultaneously (in parallel). The visual features
detected at each spatial location of the letter string
define a set of possible letters for that position. The 
primary recognition process chooses from this set of 
candidates the letter alternative which has the best 
correspondence in terms of visual features. However, the 
selection of a letter can be facilitated by the reader's 
knowledge of orthographic structure. The primary
recognition process, therefore, attempts to utilize both the
featural information in preperceptual storage and knowledge 
about the structure of letter strings in long-term memory. 
We assume that orthographic structure is utilized in the 
following manner: Upon presentation of a letter string, the 
primary recognition process begins integrating and 
synthesizing featural information passed on by feature 
detection to preperceptual visual storage. Featural 
information is resolved at different rates and there is some
evidence that gross features are available before the more 
detailed features (Massaro & Schmuller, 1975). The primary 
recognition process is faced with a succession of partial 
information states. These partial information states are 
supplemented with knowledge about orthographic structure. 
Assume, for example, that an initial th has been perceived in a
letter string, and the features available for the next
letter eliminate all alternatives except c and e. The
primary recognition process would synthesize e without
waiting for further visual information, since initial thc is
not acceptable, while initial the is.
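
The th example can be sketched in code. This is only an illustration under stated assumptions, not the model itself: the set `LEGAL_INITIAL_TRIGRAMS` and the function `resolve_next_letter` are hypothetical names, and the set is a tiny stand-in for the reader's orthographic knowledge. A fuller account would weigh graded featural evidence rather than all-or-none candidate sets.

```python
# Hypothetical, tiny stand-in for orthographic knowledge in long-term memory:
# word-initial three-letter sequences treated as acceptable in English.
LEGAL_INITIAL_TRIGRAMS = {"the", "thi", "tha", "tho", "thr", "thu"}


def resolve_next_letter(prefix, candidates, legal=LEGAL_INITIAL_TRIGRAMS):
    """Return the single candidate that orthographic structure allows, else None.

    prefix     -- letters already recognized (e.g. "th")
    candidates -- letters still compatible with the partial visual features
    """
    allowed = [c for c in sorted(candidates) if prefix + c in legal]
    # Commit early only when structure leaves exactly one alternative;
    # otherwise more visual information is needed.
    return allowed[0] if len(allowed) == 1 else None


print(resolve_next_letter("th", {"c", "e"}))  # prints: e
print(resolve_next_letter("th", {"e", "i"}))  # prints: None
```

In the second call both alternatives are orthographically acceptable, so structure alone cannot settle the choice and the sketch defers to further featural information.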

The primary recognition process transmits a sequence of
recognized letters to synthesized visual memory. Figure 1
shows how the secondary recognition process transforms this
synthesized visual percept into a meaningful form in
generated abstract memory. We assume secondary recognition
attempts to close off the letter string into a word. The
secondary recognition process makes this transformation by
finding the best match between the letter string and a word
in the lexicon in long-term memory. Each word in the
lexicon contains both perceptual and conceptual codes. The 
word which is recognized is the one whose perceptual code 
gives the best match and whose conceptual code is most 
appropriate in that particular context. Knowledge of 
orthographic structure can also contribute to secondary 
recognition; word recognition can occur without complete 
recognition of all of the component letters. Given the 
letters bea and the viable alternatives l and t in final
position, only t makes a word, and therefore word 
identification (lexical access) can be achieved (Massaro, 
Note 1). 
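
The bea example can likewise be sketched as a lexicon lookup over partially resolved letters. The names `LEXICON` and `match_word` are hypothetical, the lexicon is a toy list, and the final position is taken here to be narrowed to l and t. The conceptual code and context that the model also consults are left out.

```python
from itertools import product

# Hypothetical miniature lexicon standing in for the reader's long-term memory.
LEXICON = {"beat", "bead", "bean", "bear", "heat", "belt"}


def match_word(position_candidates, lexicon=LEXICON):
    """Return the unique word consistent with the per-position candidates.

    position_candidates -- one string of viable letters per position,
                           e.g. ["b", "e", "a", "lt"] for bea + {l, t}
    Returns None when zero or several words remain possible.
    """
    strings = {"".join(letters) for letters in product(*position_candidates)}
    words = sorted(strings & lexicon)
    return words[0] if len(words) == 1 else None


print(match_word(["b", "e", "a", "lt"]))  # prints: beat
print(match_word(["b", "e", "a", "dt"]))  # prints: None
```

The second call returns None because both bead and beat fit, so lexical access would have to wait for more visual information about the final letter.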

Our goals in the present series of experiments are to
provide a better understanding of the primary and secondary
recognition processes and to evaluate which aspects of
orthographic structure the reader knows and uses. To assess
how readers utilize knowledge about the structure of written 
language, it is necessary to state various descriptions of 
this structure and then to determine how well these 
descriptions capture reading performance. Venezky and
Massaro (1979), Massaro, Venezky, and Taylor (1979), and
Massaro et al. (1980) have distinguished between two broad
categories of orthographic structure: statistical
redundancy and rule-governed regularity. The first category
includes all descriptions derived solely from the frequency
of letters and letter sequences in written texts. The
second category includes all descriptions derived from the
phonological constraints in English and scribal conventions
for the sequences of letters in words. Since a change in
one category would not affect the other, the two categories
were viewed as nonoverlapping. The task then was first to
decide which general category seemed to reflect the manner
in which readers store knowledge of orthographic structure
and second, to determine precisely which specific
description within that category has the most psychological
reality.

Massaro et al. (1979a, 1980) contrasted a specific
statistical-redundancy description with a specific rule-
governed description by comparing letter strings that varied
orthogonally with respect to these descriptions. The
statistical redundancy measure was summed token single-
letter frequency. The rule-governed regularity measure was
a preliminary set of rules similar to those presented in
Table 2 of the present paper. Letter strings were selected
which represented the four combinations formed by a
factorial arrangement of high and low frequency and regular
or irregular. In a series of experiments utilizing a
target-search task, subjects were asked to indicate whether
a target letter was present in these letter strings. Both
accuracy and reaction-time measures indicated psychological
reality for both the frequency and the regularity
description of orthographic structure.

Massaro et al. (1980) formalized the language
processing model to provide a quantitative description of 
the facilitative effect of orthographic structure on task 
accuracy. The basic assumption of the model is that 
knowledge of orthographic structure contributes an
independent source of information about the letter string. 
By an independent source of information, we mean that 
knowledge of orthographic structure does not modify or 
direct the feature detection process. Rather, information
about visual features and orthographic structure accumulates 
from sources that do not interact. Since information about 
structure adds to featural information, fewer visual 
features are necessary to resolve well-structured than
poorly-structured strings. The model was applied to the 
target-search task by formalizing a decision algorithm 
assumed to be used by the subject when faced with partial 
information. The model provided a good quantitative
description of the accuracy results. The parameters of the 
model were psychologically meaningful and the parameter 
values corresponding to the number of letters seen in the 
test string provided a quantitative measure of the 
contribution of orthographic structure. According to the
model, readers were able to recognize two additional letters 
in brief presentations of well-structured strings compared 
to poorly-structured strings. This is a substantial offset 
considering that two letters represent one-third of the 
six-letter test string. These results indicate that we had 
developed good initial approximations of both a description 
of orthographic structure and the means by which structure 
and visual features combine during word recognition. This 
bolstered our hope that a precise description of 
orthographic structure can eventually be determined and that 
a thorough understanding of the word recognition processes 
in reading can eventually be obtained. 

Massaro et al. (1980) also conducted a series of overt 
judgment experiments to assess which descriptions of 
orthographic structure are consciously available. We asked 
whether subjects could discriminate among the items on the
basis of rule-governed regularity or on the basis of
statistical redundancy. Subjects were presented pairs of 
letter strings and asked to choose the member of each pair 
which most resembled written English. The instructions 
emphasized either a regularity or a statistical-redundancy
criterion. Subjects' judgments appeared to be more
accurately described by rule-governed regularity than by 
statistical redundancy. In this way, the results from the 
overt judgment task paralleled the results from the target- 
search task. Evidently, readers not only use their 
knowledge of orthographic structure during the word
recognition process, but also are aware of this knowledge 
and can use it in tasks requiring decisions after the word 
recognition processes have been completed. As suggested by 
the model, orthographic structure appears to exert an 
influence on several stages of language processing (Massaro, 
1980). 

The factorial design of the Massaro et al. (1980) 
experiments contrasted just one measure of rule-governed 
regularity with one measure of statistical redundancy. 
Therefore, a large number of post hoc correlational analyses 
was conducted to evaluate a wide range of measures of 
orthographic structure. This was a first step towards 
refining our initial measures of orthographic structure. 
Through these correlations, it might be possible to 
determine the necessary refinements to reach our goal of a 
psychologically real description of orthographic structure. 
The dependent measure was the performance on each of 200 
test items. Position-sensitive summed log bigram frequency 
provided the best statistical-redundancy description of 
performance on the individual items. Furthermore, an 
improved rule-based regularity measure also provided a very 
good description. However, the regularity measure 
correlated very highly with the best frequency-based 
measure. For this reason, it was not possible in these 
experiments to make a definitive choice between rule-
governed and statistical-redundancy descriptions of
structure. 

Since Massaro et al. (1980) were not successful in 
choosing between rule-governed regularity and statistical 
redundancy descriptions, the next step is to refine our 
measures of structure in a further attempt to select a 
single measure of structure. Given the best statistical- 
redundancy measure, it is possible to develop a new set of 
test items to contrast this measure with an improved rule-
governed regularity measure. We follow this logic in the
present studies by factorially contrasting bigram frequency 
and regularity measures in target search and overt judgment 
tasks. Although bigram frequency and regularity are highly 
correlated, a design involving orthogonal contrasts might be 
sufficient to distinguish between them. As with the 
previous experiments (Massaro et al., 1980), it again will be
necessary to examine post hoc correlations to determine
whether some other measure might provide an even better
description. By refining and repeatedly testing measures of
structure, we hope to arrive at a single description that
best reflects the reader's knowledge of orthographic
structure.



Experiments 1 and 2



Method 

Subjects. Nine subjects were used in the first
experiment and eleven were used in the second. All were 
Introductory Psychology student volunteers who received 
credit toward their course grade for participating. 
Additionally, they were all native English speakers, right-
handed, had normal or corrected-to-normal vision, and had not
participated in any of the other experiments. 

Stimuli and apparatus. A sample of high-frequency
words was obtained from a list of all six-letter words from 
Kucera and Francis (1967), subject to the constraints that 
the words had a frequency greater than or equal to 50, were 
not proper nouns, and did not have repeated letters. A
similar list of words with a frequency of exactly three was 
used to obtain low-frequency words. For each word in these
two lists, all possible 720 anagrams were generated and each 
of their summed-positional bigram frequencies was
calculated. The bigram frequencies were based on counts 
given by Massaro et al. (1980) which were derived from the 
Kucera and Francis (1967) word list. Forty high-frequency 
and 40 low-frequency words were selected along with four 
anagrams of each word. The anagrams were selected so that 
they formed a factorial arrangement of high and low summed- 
positional bigram frequency and of being orthographically 
regular and irregular. Orthographic regularity was 
manipulated in the same manner as in previous experiments 
(Massaro et al., 1979, 1980). The rules for choosing
regular and irregular strings are given in Table 1.

Table 1

The Rules for Choosing Regular and Irregular Letter Strings

Letter strings were regarded as regular if they
were phonologically legal and contained common vowel
and consonant spellings. A letter string was regarded
as orthographically irregular if it contained at least
one of the following spellings:

a. a phonologically illegal initial or final
cluster (e.g., rlhued or eigopn)

b. an orthographically illegal spelling for an
initial or final consonant or consonant
cluster (e.g., xeoich or tmoreh)

c. an illegal vowel spelling (e.g., caeinm)

d. a phonologically illegal medial cluster
(e.g., ilrmed)

Some
examples of the words and their respective anagrams are 
presented in Figure 2. Number and person have high word 
frequencies while hurdle and pigeon have low frequencies. 
The letter string rumben is a regular-high anagram of the 
word number, and helrud is a regular-low anagram of hurdle.
The number in each cell gives the average summed-positional
bigram frequency for the items of that class. For example,
the irregular-high anagrams of high frequency words have an 
average count of 5738. 
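
The stimulus-selection step, enumerating the anagrams of a word and scoring each by summed-positional bigram frequency, can be sketched as follows. The counts in `TOY_COUNTS` are invented for illustration (the actual values come from the Massaro et al., 1980, tables), and the function names are hypothetical.

```python
from itertools import permutations

# Hypothetical position-sensitive bigram counts: (position, bigram) -> tokens.
# Real values would be taken from the Massaro et al. (1980) tables.
TOY_COUNTS = {
    (0, "nu"): 120, (1, "um"): 90, (2, "mb"): 60, (3, "be"): 200,
    (4, "er"): 900, (0, "ru"): 80, (4, "en"): 400,
}


def summed_positional_bigram_frequency(string, counts):
    """Sum the count of each adjacent letter pair at its position in the string."""
    return sum(counts.get((i, string[i:i + 2]), 0) for i in range(len(string) - 1))


def anagrams(word):
    """All orderings of the letters of a word that has no repeated letters."""
    return ["".join(p) for p in permutations(word)]


candidates = anagrams("number")
print(len(candidates))  # prints: 720
print(summed_positional_bigram_frequency("number", TOY_COUNTS))  # prints: 1370
print(summed_positional_bigram_frequency("rumben", TOY_COUNTS))  # prints: 830
```

Sorting the 720 orderings by this score would give the high and low ends from which anagrams could be drawn; the regularity classification of Table 1 is a separate judgment that this sketch does not attempt.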

Twenty arbitrarily chosen high-frequency words and 
their anagrams as well as 20 low-frequency words and their 
anagrams were selected as stimuli for the first experiment. 
The remaining 20 high- and 20 low-frequency anagrams were 
used with new subjects in the second experiment. The letter 
strings for the two experiments are presented in Appendices 
1 and 2. 

The visual displays were generated by a DEC LSI-11
computer under software control and presented on a Tektronix
Monitor 604 oscilloscope (Taylor, Klitzke, & Massaro, 1978a,
1978b). These monitors employ a P31 phosphor which decays
to .1% of stimulated luminance within 32 msec of stimulus
offset. The alphabet consisted of lower-case nonserifed
letters resembling the type font Univers 55. For an
observer seated comfortably at an experimental station, the
six-letter displays subtended about 1.9 degrees of visual
angle horizontally and the distance from the top of an
ascender to the bottom of a descender was about .4 degree.
Up to four subjects could be tested in parallel in separate
rooms.

                                Summed Positional Bigram Frequency
Orthographic    Word
Regularity      Frequency       High                Low

Words           High            number, person
                Low             hurdle, pigeon

Regular         High            rumben, preson      runemb, roneps
                                (5624)              (1415)
                Low             hulder, gopine      helrud, ginope
                                (4861)              (1106)

Irregular       High            bemrnu, npsore      bmernu, pnseor
                                (5738)              (1424)
                Low             rlhued, eiopng      ldeurh, eigopn
                                (4920)              (1113)

Figure 2. Examples of the test words and their
corresponding anagrams from Experiments 1 and 2.
Within each of the five squares, the top two items
correspond to high word frequency and the bottom two
items correspond to low word frequency. The number in
each of the ten cells represents the summed positional
linear bigram frequency.

Procedure. A trial (see Figure 3) began with the
presentation of a 250 msec fixation point. The fixation 
point was replaced by a test letter string, i.e., a word or 
an anagram, for a duration of 10-39 msec. The duration on a 
particular trial for each subject was determined by his or 
her accuracy. The duration was adjusted every 20 trials by 
a modified version of the PEST algorithm (Taylor & Creelman,
1967) in order to keep the subject's average accuracy at
about 75%. A masking stimulus followed the onset of the
test string after a 70 msec interval. Therefore, the blank 
interval between the test stimulus and the masking stimulus 
was (70-t) msec, where t was the duration of the target 
string. The masking stimulus was composed of six nonsense 
letters. Each nonsense letter changed from trial to trial 
and was composed of a montage of randomly selected features
of the test letters. The feature density of a nonsense 
letter was equal to that of the letter £. The size of the 
nonsense letters was equivalent to that of the test string. 
The duration of the mask was adjusted along with the 
duration of the test string. The mask remained on the 
screen for (40-t) msec, giving a range of durations of 1-30
msec. The mask was followed by another blank interval and 
then the target letter. The second blank interval lasted
180 msec minus the duration of the mask. Therefore, the
interval between the onset of the test letter string and the
target letter was always 250 msec. The target letter
remained on the screen until all subjects responded or for a
maximum of four seconds. Finally, the interval between
trials was 500 msec.

Figure 3. A schematic representation of the
perceptual recognition task used in Experiments 1-7.
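
The timing arithmetic of a trial can be checked with a short sketch. The function name `trial_timeline` is hypothetical, and the mask duration of 40 - t msec is inferred from the stated 1-30 msec range for test durations of 10-39 msec.

```python
def trial_timeline(t):
    """Return the event durations (msec) of one trial for test duration t.

    Follows the intervals described in the procedure: mask onset 70 msec
    after test onset, a mask lasting 40 - t, and a second blank interval
    of 180 minus the mask duration.
    """
    if not 10 <= t <= 39:
        raise ValueError("test duration must be between 10 and 39 msec")
    first_blank = 70 - t
    mask = 40 - t  # inferred from the reported 1-30 msec range
    second_blank = 180 - mask
    return {
        "fixation": 250,
        "test_string": t,
        "first_blank": first_blank,
        "mask": mask,
        "second_blank": second_blank,
        # Test onset to target onset; constant regardless of t.
        "test_to_target": t + first_blank + mask + second_blank,
    }


for t in (10, 25, 39):
    print(t, trial_timeline(t)["test_to_target"])  # each line ends in 250
```

Because the mask and the second blank interval trade off exactly, the adaptive changes in test duration never alter the 250 msec delay between test onset and target letter.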

Subjects were instructed to indicate whether the target 
letter was present in the test string and to be as accurate 
as possible. The experiment consisted of a session of 100 
practice trials with a practice list that was comparable to 
the experimental list and two sessions of 400 experimental 
trials each. Within each session, each item was tested once 
as a target string and once as a catch string. On target 
trials, the target letter was selected randomly with 
replacement from the six letters in the test string. For 
catch trials, a target was selected randomly from the set of 
26 letters weighted by their probability of occurrence in 
the stimulus set. If the selected letter was present in the 
test string, additional drawings with replacement were made 
until an appropriate target letter was selected. Some 
letters did not occur in the test strings and therefore were
never tested. A short rest break intervened between the two
experimental sessions. The total time for the three 
sessions and the rest break was about 75 minutes. Both 
experiments were conducted in exactly the same manner except
that different subjects and different items were used in 
each. 
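
The selection of target letters on target and catch trials can be sketched as below. The names `letter_weights` and `pick_target` are hypothetical, and the three-word stimulus set is only a stand-in for the full experimental list.

```python
import random
import string
from collections import Counter


def letter_weights(stimuli):
    """Count how often each of the 26 letters occurs in the stimulus set."""
    counts = Counter("".join(stimuli))
    return {letter: counts.get(letter, 0) for letter in string.ascii_lowercase}


def pick_target(test_string, weights, catch, rng=random):
    """Choose a target letter for a target trial or a catch trial."""
    if not catch:
        # Target trial: one of the six letters in the string, equally likely.
        return rng.choice(test_string)
    letters = list(weights)
    probs = [weights[letter] for letter in letters]
    while True:
        # Catch trial: draw by frequency in the stimulus set and redraw
        # (with replacement) until the letter is absent from the string.
        candidate = rng.choices(letters, weights=probs, k=1)[0]
        if candidate not in test_string:
            return candidate


weights = letter_weights(["number", "hurdle", "pigeon"])
print(pick_target("number", weights, catch=True))
```

Letters with a weight of zero are never drawn, which corresponds to the observation that letters absent from the stimulus set were never tested.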

Results

Two analyses of variance were performed on the
percentage accuracy scores. In the first analysis, word
frequency, type of test letter string, target or catch trial,
and subjects were factors. In the second analysis, the word
data were eliminated and regularity and bigram frequency
were factors in the design. Figure 4 shows the average
percentage correct on target and catch trials as a function
of letter-string type in Experiment 1. There were large
differences among the various types of letter strings, F(4,
32) = 130.7, p < .001. Regular items resulted in a 9.3%
accuracy advantage over irregular items, F(1, 8) = 74.7, p <
.001, while items of high summed-positional bigram frequency
had a 2.5% advantage over items of low summed-positional
bigram frequency, F(1, 8) = 11.4, p < .01. The advantage
of high bigram frequency was limited to regular items, F(1,
8) = 10.0, p < .05. The difference in accuracy between
words and the regular-high anagrams was 12.0%, F(1, 32) =
23.3, p < .001. There was no difference in accuracy between
target (72.4%) and catch (77.2%) trials, F < 1, and this
variable did not interact with letter-string type, F < 1.

Figure 5 gives the average percentage correct for the
high and low word frequency words and their anagrams as a
function of letter-string type. There was an overall 2.7%
advantage for the low-frequency words and their anagrams,
F(1, 8) = 9.86, p < .015, and word frequency also interacted
with letter-string type, F(4, 32) = 6.18, p < .001. The
overall effect of letter-string type was 23.3% for the
high-frequency words and their anagrams and 19.5% for the
low-frequency words and their anagrams. This difference
reflected the fact that high-frequency words were recognized
more accurately than low-frequency words, but that the reverse
was the case for the four types of anagrams. Word frequency did
not interact with target vs. catch trials nor was there a
three-way interaction with these variables and letter-string
type (Fs < 1).

Figure 4. Percentage correct as a function of
display type for target and catch trials in Experiment 1.

Figure 5. Percentage correct as a function of
display type for items corresponding to high and low
word frequency in Experiment 1.

Figure 6 gives performance for target and catch trials
as a function of letter-string type in the second experiment
using new items and new subjects. There were large
differences among letter-string types, F(4, 40) = 92.76, p
< .001. Regular items resulted in 8.7% greater accuracy
than irregular items, F(1, 10) = 49.1, p < .001. Items of
high summed-positional bigram frequency resulted in 3.3%
greater accuracy than items of low summed-positional bigram
frequency, F(1, 10) = 12.2, p < .05, but the advantage
occurred only for regular items, F(1, 10) = 8.2, p < .025.
The difference between words and regular-high anagrams was
13.1%, F(1, 40) = 18.0, p < .001. There was no
difference in accuracy between target (74.1%) and catch
(78.4%) trials, F < 1, and this variable did not interact
with letter-string type, F < 1.

Figure 7 gives average percentage correct for the high
and low word frequency words and their anagrams as a
function of letter-string type. There was a 2.5% advantage
for items of the high-frequency words, F(1, 10) = 1.87, p <
.052, but word frequency did not enter into any
interactions. The overall effect of letter-string type was
21.3% for high-frequency items and 25.8% for low-frequency
items. The interaction of word frequency and letter-string
type found in the first experiment and shown in Figure 4 was
not replicated in the second experiment, F(4, 40) = 1.15,
p > .25.

Figure 6. Percentage correct as a function of
display type for target and catch trials in Experiment 2.

Figure 7. Percentage correct as a function of
display type for items corresponding to high and low
word frequency in Experiment 2.

Correlational analysis 

The factorial design is limited in terms of providing a 
quantitative assessment of the importance of frequency and 
regularity measures of orthographic structure. The present 
design contrasted just one frequency measure against just 
one regularity measure. Therefore, post hoc correlational 
analyses were carried out to provide an analysis of a range 
of descriptions of orthographic structure. The independent 
variables used in this analysis included a number of 
measures based on frequency counts for letters, n-grams, and
words, in addition to a few quantitative measures based on
orthographic rules. The dependent measure in all cases was
average accuracy for each six-letter test item. The
accuracy scores were obtained by averaging across subjects
and across target and catch trials. Each of the two
experiments used 40 words, 20 each of high and low word
frequency, and four corresponding anagrams for a total of
200 stimulus items per experiment. Each subject had been
presented with each item twice as a target trial and twice 
as a catch trial. Accordingly, the accuracy score for each 
item in the first experiment was based on 36 observations (4 
replications x 9 subjects) while the accuracy score for the 
second experiment was based on 44 observations (4 
replications x 11 subjects). 

Frequency Measures 

The source of the frequency measures is based on a word 
corpus compiled by Kucera and Francis (1967). This corpus 
consisted of 500 samples of approximately 2,000 words each 
selected from 15 categories. A description of the corpus, 
its selection, and its processing are presented by Kucera 
and Francis (1967, pp. xvii-xxv). Massaro et al. (1980) 
used these words to derive the frequencies of occurrence of 
single letters, bigrams, and trigrams. A magnetic tape of 
the word count produced from the corpus (i.e., the "Rank 
List" in Kucera & Francis) was obtained. The words were 
sorted into 10 lists consisting of 1- to 10-letter words, 
respectively. Words longer than 10 characters were deleted 
as were items containing numbers, punctuation, or special 
codings for capitalizations, foreign alphabets, and unusual 
graphic features or symbols. This resulted in 10 lists of 
words, one for each letter length. These word lists formed 
the basis for counts of single letters, bigrams, and 
trigrams.

Tables were prepared by counting the occurrence of each
n-gram at the position it occurred in words of a given 
length. The counts were token counts based upon the total 
number of occurrences of the words containing the n-gram. A
position-insensitive count (but still word length dependent) 
was also obtained for each n-gram by summing across the 
position-dependent counts. Because Kucera and Francis 
maintained a faithful count of the actual graphic patterns 
found in the corpus, their list contains rare words,
typographic errors, foreign person and place names, and
other idiosyncratic items. To limit the impact of such
items on these tabulations, cut-off limits were established 
for both word frequency and number of samples. The cut-offs 
were a minimum of one occurrence in each of at least three 
samples. Thus, unusual words and usages, regardless of 
their frequency, were ignored unless they occurred in three 
or more separate samples. Although this limit was 
arbitrary, inspection of the word list in the low frequency 
range indicated that these were reasonable cut-offs. The 
single-letter tables and bigram tables for word lengths 3 
through 7 are presented in Massaro et al. (1980). 
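The tabulation described above can be sketched as follows. This is a minimal illustration, not the procedure actually run on the Kucera and Francis tape: the function names, the `(word, token_frequency, n_samples)` tuple layout, and the toy corpus in the usage note are all assumptions made for the example.

```python
from collections import defaultdict


def positional_ngram_counts(corpus, n, min_samples=3):
    """Position-sensitive token counts of n-grams, kept separately per word length.

    corpus: iterable of (word, token_frequency, n_samples) tuples (assumed layout).
    Words occurring in fewer than min_samples samples are ignored, which mirrors
    the cut-off of at least three separate samples described in the text.
    Returns a dict keyed by (word_length, position, ngram), positions starting at 1.
    """
    tables = defaultdict(int)
    for word, freq, samples in corpus:
        if samples < min_samples:
            continue
        for pos in range(len(word) - n + 1):
            # Token count: add the word's frequency, not 1 per word type.
            tables[(len(word), pos + 1, word[pos:pos + n])] += freq
    return dict(tables)


def position_insensitive(tables):
    """Sum the position-dependent counts; the result is still word-length dependent."""
    summed = defaultdict(int)
    for (length, _pos, gram), count in tables.items():
        summed[(length, gram)] += count
    return dict(summed)
```

With a hypothetical corpus such as `[("the", 100, 50), ("hen", 3, 3)]`, the bigram `he` is counted 100 times at position 2 and 3 times at position 1 of three-letter words, and the position-insensitive count for three-letter words is their sum.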

Type counts are based on the number of word types that 
contain a given n-gram and these counts may also be relevant 
descriptors of frequency-based measures of orthographic 
structure (Solso & King, 1976). However, Massaro et al.
(1980) found that the correlations between comparable type
measures and token measures were very high. Measures based
on single letters, bigrams, and trigrams, both position
sensitive and position insensitive, correlated between
and .99. With such high correlations no meaningful
discrimination between type and token measures can be made
unless test items are selected with this contrast in mind.
For this reason we will discuss only measures based on the
token counts derived by Massaro et al. (1980).

The present analysis will be restricted to position- 
sensitive counts. Massaro et al. (1980) found that
position-sensitive counts give consistently better
descriptions of performance than do position-insensitive
counts. For single-letter frequency, for example, the
correlation with average accuracy was only .2 for position- 
insensitive counts but .62 for position-sensitive counts. 
The advantage of position sensitivity was attenuated, 
however, as the length of the n-gram increased. 

While the effects of frequency seem to be 
psychologically real, it is not necessary that the mental 
representations of frequency directly reflect the frequency 
of objective counts. One alternative scale that has been 
successful in other research is a logarithmic (base 10) 
scale. Not only are there some data to suggest the
possibility of a logarithmic representation (Massaro et al.,
1980; Solomon & Postman, 1952; Travers & Olivier, 1978;
Taylor, Note °), but also a logarithmic representation is
consistent with recent studies of number representation
(Shepard & Podgorny, 1978) and with many other psychological
scales. Therefore, we computed all of our frequency
measures based upon both regular linear frequencies and log
frequencies. Since counts were sometimes zero, the log of
zero was defined as zero. Therefore, the two sets of
measures being correlated were sums of position-dependent 
single letters, bigrams, and trigrams derived from either 
linear-frequency or log-frequency tables. 
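A summed positional measure of this kind might be computed as in the sketch below. It assumes a lookup table keyed by `(word_length, position, ngram)`; that key layout and both function names are illustrative choices, not taken from the report. The convention that the log of a zero count is zero follows the text.

```python
import math


def log_count(count):
    """Base-10 log of a count, with log(0) defined as 0 as in the text."""
    return math.log10(count) if count > 0 else 0.0


def summed_positional_score(string, table, n, use_log=True):
    """Sum position-dependent n-gram counts (or their logs) over a letter string.

    table: dict keyed by (word_length, position, ngram), positions starting at 1
           (an assumed layout). N-grams missing from the table count as zero.
    """
    total = 0.0
    for pos in range(len(string) - n + 1):
        count = table.get((len(string), pos + 1, string[pos:pos + n]), 0)
        total += log_count(count) if use_log else count
    return total
```

For a hypothetical table in which `th` occurs 100 times at position 1 and `he` occurs 1000 times at position 2 of three-letter words, the string `the` receives a linear bigram score of 1100 and a log bigram score of 2 + 3 = 5.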

Regularity Measures 

To provide a quantitative measure of the regularity of 
each of the 400 stimulus items, a simple count of the 
number of orthographic irregularities for each item was 
computed based on the rules developed by Massaro et al. 
(1980). The rules are given in Table 2. This measure of 
regularity provided a reasonable description of performance 
in the Massaro et al. (1980) studies. We will refer to this 
measure as Regularity(1). One critical feature of the rules
for Regularity(1) is that letter strings are treated as
monosyllabic and many legal and occurring medial consonant
clusters are treated as irregular. For example, the word
person would be considered to have an irregularity since,
according to rule 2, the medial consonant cluster rs would
not be legal in initial position. However, the consonant
cluster rs is regular in medial position when considered as
part of a two syllable word. Therefore, a second 
quantitative measure of regularity was derived that removed 
the constraint that the letter string must be considered as
a monosyllabic string. This measure is referred to as
Regularity(2).

Table 2

The rules for an Irregularity Count (after Massaro et
al., 1980).

1. Segment string into vowel and consonant substrings.
Treat final -le as if it were -el. Treat h between
vowels as a (legal) consonant.

2. For each consonant string, determine minimal number of
vowels which must be inserted to make the string
pronounceable. Initial consonant clusters must be legal
in initial position. Final consonant clusters must be
legal in final position, including those followed by
final e. Medial consonant clusters must be legal in
initial position.

3. Rate each resulting consonant substring for position-
sensitive scribal regularity (count one for each
irregular substring).

4. For each vowel substring, determine minimal number of
consonants which must be inserted to create scribally
regular sequences. Mark as irregular illegal initial
and final vowel substrings.

5. Count number of inserted vowels and consonants plus
number of scribally irregular consonant and vowel
substrings. This yields an irregularity index.

6. The vowel strings ao, ae, oe, and ue (among more obvious
cases) would be illegal vowel strings. u would be
illegal as a vowel in initial position and i, u, a, oa,
and o would be illegal in final position. ue is legal
as is y as a single, non-initial vowel.

7. h is not allowed in final position unless preceded by c,
£, or s.

8. £ and w between vowels are to be counted as consonants.

The rules for Regularity(2) were identical to those for 
the first measure except that the application of the rules 
and the counting of the violations were carried out in order 
to minimize the number of violations for any given letter 
string. When possible, a syllable boundary was assumed in 
order to avoid a given violation. As an example, the medial 
consonant cluster md in the string limder would be an 
illegal consonant cluster in the same syllable because of 
the phonological rule governing the place of articulation of 
nasals followed by stops in a single syllable. The nasal 
and the following stop must share place of articulation; 
therefore mb and nd are possible but not md or nb. A
syllable boundary between m and d in limder is possible,
however, resulting in a perfectly legal two-syllable string
with no violations. Similarly, in the string nurdgi the
medial consonant cluster rdg is legal with a syllable
boundary between d and g. The only violation is i in final
position.

Frequency vs. Regularity 

The correlations of several measures with average 
accuracy are presented in Table 3. The correlation needed 
for statistical significance at p = .01 with 198 degrees of
freedom is .18. Of central interest is the relative ability
of bigram frequency and regularity measures of orthographic
structure to predict performance. Two dummy variables were
created to contrast these two measures while equating for
the range and levels of each measure. The dummy regularity
variable assigned a 1 to words and regular nonwords, and a 0
to irregular nonwords. The dummy frequency variable
assigned a 1 to words and high bigram frequency nonwords,
and a 0 to low bigram frequency nonwords. In both
experiments the regularity variable correlated much higher
(.19, .51) with performance than did the frequency variable
(.26, .37).

Table 3

Correlations of Several Predictor Variables With
Overall Accuracy Performance in Experiments 1 and 2

It is not possible to choose between regularity and 
frequency measures of orthographic structure. Although the 
regularity counts do better than linear frequency counts and 
log single letter counts, log bigram and log trigram counts 
do better than regularity. Both measures account for a 
significant portion of the variance in performance. 
Regularity and frequency measures are positively correlated 
with each other. As an example, log trigram frequency and 
Regularity(2) correlate .47 and .16 for the items in 
Experiments 1 and 2, respectively. A multiple regression 
was carried out treating the summed frequency counts and the 
irregularity counts as independent variables. The best 
combination of predictors was log trigram frequency and 
Regularity(2), which accounted for 15% of the variance in
Experiment 1 and 49% in Experiment 2.

Linear word frequency correlated .45 and .51 while log
word frequency correlated .55 and .59 with performance in 
the two experiments. The correlations with performance on
just the 40 word items were .46 and .51 for linear and log
frequencies in the first experiment and .34 and .35 in the
second experiment. Log word frequency was highly correlated 
with both log bigram frequency (.50, .52) and log trigram 
frequency (.77, .7*0. A dummy word frequency variable which 
assigned a 1 to words and a 0 to nonwords was more highly 
correlated (.60, .68) with performance than was log word 
frequency. Although it is possible that lexical status 
makes an independent contribution to performance, the high 
correlations between word frequency and sublexical 
orthographic structure measures preclude resolution of this 
issue. 

Serial Position

The correlations between the three log frequency 
measures at each serial position and overall performance 
are shown in Figures 8 and 9. In general, the frequency of 
n-grams at the beginning and end of the items predicts 
performance better than n-grams in the middle. To evaluate 
whether this effect is due to the informational constraints 
in the stimuli themselves, we derived a measure of
redundancy or predictability for each serial position. The
variance of letter occurrences at each serial position was
computed based on the table of frequencies given by Massaro
et al. (1980). High variance occurs to the extent that some
letters occur more often and, therefore, are more
predictable than others. These variance measures for the
log single-letter, log bigram, and log trigram counts are
shown in Figure 10. For single letters there is less
redundancy at the middle positions relative to the end
positions. There is very little change in redundancy across
serial positions for the bigram and trigram measures. The
redundancy and performance measures are nicely correlated
for single letters, but uncorrelated for bigrams and
trigrams.

Figure 8. Correlations of accuracy with log
single-letter, log bigram, and log trigram frequencies
as a function of their respective serial positions in
the six-letter strings in Experiment 1.

Figure 9. Correlations of accuracy with log
single-letter, log bigram, and log trigram frequencies
as a function of their respective serial positions in
the six-letter strings in Experiment 2.

A second measure of redundancy was calculated by taking 
an average uncertainty measure H, based on Shannon's (1948,
1951) equation,

$$H = -\sum_{i=1}^{N} P_i \log_2 P_i ,$$

where P_i is the probability of occurrence of the i-th letter or
letter cluster at a given position and N is the total
number of letters or letter clusters that occur at that 
position. These uncertainty measures are presented in 
Figure 11. Uncertainty measures the degree to which letters
or letter clusters are unpredictable; we might expect better 
correlations with performance at those serial positions with 
small values of uncertainty. 
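As a rough illustration of the uncertainty measure, the sketch below computes Shannon's H for the letters (or letter clusters) observed at one serial position. The function name and the dictionary-of-counts input are assumptions for the example, and base-2 logarithms (bits) are assumed as the conventional choice for Shannon's measure.

```python
import math


def uncertainty(counts):
    """Shannon uncertainty H (in bits) for the n-grams seen at one serial position.

    counts: dict mapping each letter or letter cluster to its frequency of
            occurrence at that position (assumed input layout).
    """
    total = sum(counts.values())
    h = 0.0
    for c in counts.values():
        if c > 0:
            p = c / total          # probability of this n-gram at the position
            h -= p * math.log2(p)  # accumulate -p * log2(p)
    return h
```

A position at which four letters occur equally often has H = 2 bits, while a position always occupied by the same letter has H = 0, so smaller values of H correspond to greater redundancy.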

As can be seen in Figure 11, the uncertainty measures do a
good job of predicting the performance measures for single
letters. The only discrepancy is in the fifth letter
position where subjects failed to exploit the redundancy at
this position. There is also a reasonable correspondence
for the bigram counts. However, the initial bigram predicts
performance better than the final bigram even though the
latter has less uncertainty. There is no correspondence
between the uncertainty measures and the performance
correlations for the trigram counts.

Figure 10. Variance of linear and log single-
letter, bigram, and trigram occurrences at the
respective serial positions in the six-letter strings.

Figure 11. Shannon's uncertainty measure H, for
single-letter, bigram, and trigram occurrences at the
respective serial positions for six-letter words.

Multiple Regressions 

In a series of multiple regression analyses, the 
individual log counts at each serial position were treated 
as independent variables. Of concern was which serial 
positions made statistically significant contributions in 
accounting for the variance in the data. The orders in 
which the serial positions were entered in the equations 
were 3, 2, 5, 6, 4, and 1 for single-letters, 1, 3, 5, 4, 
and 2 for bigrams, and 1, 3, 4, and 2 for trigrams. The log
single-letter counts at the sixth, second, and
positions accounted for 18% of the variance in Experiment 1.
An analogous analysis in Experiment 2 accounted for 21% of
the variance with positions 6, 2, and 1. For log bigrams,
the first, fifth, and second positions accounted for 32% of
the variance in Experiment 1. In the second experiment, the
log bigram counts at the first, fifth, and third positions
accounted for 45% of the variance. For log trigrams, the
first, fourth, and second positions accounted for 38% and
^1% of the variance in Experiments 1 and 2, respectively.

Regressions were also conducted for summed log single-
letter, bigram, and trigram frequencies and regularity.
Summed log bigram frequency and regularity were always
entered into the equation first. For the first experiment, 
bigram frequency and regularity accounted for 35% of the 
variance. With these two variables in the regression 
equation, the partial correlations for the log single-letter 
and trigram frequencies were -.05 and .39, respectively. 
For the second experiment, bigram frequency and regularity 
accounted for 42% of the variance, and the partial
correlations for log single-letter and trigram frequencies 
were .08 and .36, respectively. 
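The two quantities reported in these regressions, variance accounted for by a pair of predictors and the partial correlation of a further predictor, can be illustrated from pairwise correlations alone. The sketch below uses the standard textbook formulas; the function names are hypothetical and the numerical values in the usage note are invented for illustration, not taken from the experiments.

```python
import math


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """Proportion of variance in y accounted for by predictors 1 and 2 together.

    r_y1, r_y2: correlations of each predictor with y.
    r_12: correlation between the two predictors (must not be +1 or -1).
    """
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
```

For example, two uncorrelated predictors that each correlate .5 with accuracy jointly account for 50% of the variance, whereas the same two predictors correlated .5 with each other account for only about 33%. Holding two variables constant, as in the analyses above, amounts to applying the partial-correlation formula a second time to the first-order partials.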

Experiment 3 

The creation of the stimulus set for Experiment 3 was 
identical to that of the previous experiments except that 
log bigram rather than linear bigram counts were used and 
the strings were controlled more exactly for regularity. 
Figure 12 gives examples of the five classes of items and 
the average log bigram frequency for each class. The 
complete list of letter strings is presented in Appendix 3* 

In the studies of Massaro et al. (1980) and Experiments 
1 and 2, log-frequency measures gave consistently better
descriptions of performance than did linear-frequency
measures. Furthermore, Massaro et al. (1980) found that the
log counts were superior to a range of power-function
transformations of the linear counts. This result provides
additional evidence that if frequency of occurrence is
important, log frequency appears to be the best descriptor
of this variable.

                            Summed Positional Log Bigram Frequency
Orthographic    Word
Regularity      Frequency   High                       Low

Regular         High        rodipe, shulod (11.688)    dripoe, lohuds (3.523)
                Low         diceon, tamgen (11.143)    nidcoe, nemtag (7.842)

Irregular       High        prdioe, dhouls (11.625)    dpireo, kxihds (8.509)
                Low         cnoied, ntagem (11.083)    endcoi, nagtme (7.883)

Figure 12. Examples of the test words and their
corresponding anagrams from Experiments 3 and 4.
Within each of the five squares, the top two items
correspond to high word frequency and the bottom two
items correspond to low word frequency. The number in
each of the ten cells represents the summed positional
log bigram frequency.

A count of the number of irregularities in each letter 
string was determined using the rules for Regularity(3)
presented in Table 4. The rules for Regularity(3) were the
same as for Regularity(2) except that vowels as initial
letters violated one of the rules and therefore were counted 
as irregularities. Given this formula, it was now possible 
to equate the number of irregularities for the anagrams that 
differed only in log bigram frequency. In our previous 
studies, the number of irregularities tended to correlate 
negatively with frequency and some of the effect of 
frequency could have been due to differences in regularity. 
This possibility was eliminated in the present study by 
equating the high- and low-frequency anagrams of a given 
test word for the number of irregularities. Consider the 
test word period shown in Figure 12. The regular high and
regular low anagrams (rodipe and dripoe) do not have any
irregularities. The irregular high and irregular low
anagrams (prdioe and dpireo) have 2 irregularities each.
This design might provide a more definitive contrast between
frequency and regularity.

Table 4

Rules for Regularity(3) for the Selection of the Items
Used in Experiments 3-6.

1. For each string, rate for position-sensitive scribal
regularity and pronounceability (count one for each
violation). Treat final -le as if it were -el. Treat
h between vowels as a (legal) consonant.

2. For each string, rate for position-sensitive scribal
regularity and pronounceability (count one for each
violation). Initial consonant clusters must be legal
in initial position. Final consonant clusters must be
legal in final position including those followed by
final e.

3. For each string, rate for position-sensitive scribal
regularity and pronounceability (count one for each
violation). The vowel strings ao, ae, oe, and ^e
(among more obvious cases) would be illegal vowel
strings. All vowels are illegal in initial position
and y would be illegal as a vowel in initial position.
The vowels i, u, a, oa, and o would be illegal in final
position. ue is legal as is y as a single non-initial
vowel. h is not allowed in final position unless
preceded by c, £, or s. £ and w between vowels are to
be counted as consonants.

Method

Subjects. Nine University of Wisconsin summer school
student volunteers were used as subjects and paid $9.00 for
their participation. All were native English speakers, had
normal or corrected vision, and had not participated in any
of the other experiments.

Stimuli and apparatus. Words were selected in the same
manner as in the previous two experiments. The high-frequency
words had a Kucera and Francis (1967) frequency of at least
50, and the low-frequency words had a frequency of 3. Due to
a selection error, one low-frequency word had a frequency of
4. For each of the 80 words, four anagrams were selected so
that they formed a factorial arrangement of high and low
summed-positional log bigram frequency and of being
orthographically regular or irregular. For each set of four
anagrams, the number of irregularities was matched exactly
for the regular conditions and then again for the irregular
conditions. Finally, an additional sample of words, 13 high
and 13 low in word frequency, and their anagrams were
selected as practice items.

The 80 experimental words and their anagrams were
divided into two lists. List 1 contained one-half of the 
high word frequency items and one-half of the low word 
frequency items. List 2 contained the remaining items. 

Stimuli were presented in the same manner and on the 
same equipment as in the previous experiments with only one 
exception. The range of durations for the test letter 
strings was 5-39 msec. Because of the algorithm used,
decreasing the lower limit for the duration of the test
string increased the maximum duration of the mask to 35
msec.

Procedure. The experiment was conducted in a manner
similar to that of the previous experiments. The
presentation of the test string, masks, and target letters
was identical to that of Experiments 1 and 2. Subjects
were tested on two consecutive days. At the beginning of
each day, subjects began with a practice session of all 260
trials. Two experimental sessions of 400 trials each
followed the practice. Five of the subjects received all 200
items of List 1 on Day 1 in the first session as both target
and catch trials. These subjects then received the List 2
items in the second session. On Day 2, List 2 was presented
in the first session and List 1 in the second session. For
the remaining four subjects, the order of the lists was
reversed.

Results

Figure 13 shows the average percentage correct on 
target and catch trials as a function of letter-string type. 
There were significant differences, F (4, 32) = 127.1, p <
.001, among the five types of letter strings. Words had a
16% advantage over the regular-high anagrams, F (1, 32) =
55.1, p < .001. There was a 4.0% advantage of regular strings
over irregular strings, F (1, 8) = 32.5, p < .001, and a 1.4%
advantage of high log bigram frequency strings over low log
bigram frequency strings, F (1, 8) = <^.6, p > .2.

Figure 13. Percentage correct as a function of
display type for target and catch trials in Experiment
3.

One disquieting aspect of the results is the extreme 
asymmetry in performance between target and catch trials and the
interaction of this variable with display type, F (1, 8) =
12.8, p < .007, and F (4, 32) = 9.5, p < .001. Subjects
were extremely conservative in their willingness to indicate
that a target letter was present. This result could reflect
our failure to instruct the subjects specifically about the
relative frequency of target trials as we did in the
previous two experiments.

Figure 14 gives average percentage correct for the high 
and low word frequency words and their anagrams as a 
function of letter-string type. High frequency words and 
their anagrams were recognized 3.2% more accurately than
low-frequency words and their anagrams, F (1, 8) = 69.03, p
< .001. The interaction between word frequency and the five
types of items was not significant, showing that this
difference was not unique to the word items. Therefore,
some variable other than word frequency must be responsible
for the difference. One caveat, however, is that
performance may not be on an interval scale, which weakens
any interpretation of the lack of interaction. One solution
would be to monitor each display type independently and to
adjust the stimulus values to give an average of 75% correct
for high and low word frequencies. If word frequency still
does not interact with display type when average performance
is about 75% correct at each display type, then the
conclusion reached here would be reinforced.

Figure 14. Percentage correct as a function of
display type for items corresponding to high and low
word frequency in Experiment 3.

Correlational analysis 

The correlations of several variables with overall
performance are presented in Table 5. For all frequency
measures, the log measures correlated more highly with
performance than did the linear measures. Log trigram
frequency predicted performance better than the other
sublexical measures. Log word frequency was correlated with
accuracy (.60), but also was correlated with log bigram
frequency (.61) and log trigram frequency (.80). Among just
the high-frequency words the correlation with performance
was -.05 and -.13, respectively, for linear and log word
frequencies. (Correlations among the low-frequency words
would not be meaningful since all the items had the same
Kucera and Francis frequency of occurrence.) The lack of a
significant correlation between performance and word
frequency within the class of words replicates previous
results (Manelis, 1974) and makes it unlikely that word
frequency can account for the effects of orthographic
structure. Lexical status alone might be an important
variable, however. The dummy variable of word or nonword
gave a highly significant correlation of .60 with
performance.

Table 5

Correlations of Several Predictor Variables
with Overall Accuracy in Experiments 3 and 4.

                         Ex. 3    Ex. 4    Ave.

Single letter
    linear                .29      .26      .31
    log                   .36      .28      .36

Bigram
    linear                .38      .33      .40
    log                   .53      .43      .54

Trigram
    linear                .44      .36      .45
    log                   .59      .48      .60

Word frequency
    linear                .44      .37      .46
    log                   .60      .50      .63

Regularity Count(3)       .40      .29      .39

Multiple Regressions

Multiple regressions were carried out as in Experiments 
1 and 2. The log single-letter counts at the sixth, second, 
and first letter positions accounted for 12% of variance in 
Experiment 3. For the log bigram counts, the first, fifth,
and second positions accounted for 23% of the variance. 
Finally, for the log trigram counts, 36% of the variance was 
accounted for by the first, fourth, and second serial 
positions. 

For the regressions with summed log bigram frequency 
and Regularity(3) entered into the equation, 35% of the
variance was accounted for. The partial correlations for
log single-letter and log trigram frequencies were -.07 and
.28, respectively. 



Experiment 4 

Method 

Eight new University of Wisconsin undergraduates from 
Introductory Psychology who met the same requirements as in
the previous experiments were used as subjects. Experiment
4 was an exact replication of Experiment 3 with one
exception. The instructions were modified to inform
subjects that a target letter would appear in the test
string on 50% of the trials. It was expected that this
manipulation would attenuate the asymmetry in the number of
positive and negative responses.

Figure 15 shows the average percentage correct for
target and catch trials for the five letter-string types. 
The significant differences among the letter-string types, 
F (4, 28) = 102.7, p < .001, completely replicate Experiment
3. There was a 15.0% advantage of words over regular-high
anagrams, F (1, 28) = 52.4, p < .001; a 2.4% advantage for
regular strings, F (1, 7) = 10.2, p < .025; and only a 0.7%
advantage for high-frequency strings, F (1, 7) = .84.

Though the responding asymmetry was substantially 
reduced, there nonetheless was still a tendency for subjects 
to remain conservative in their willingness to indicate that 
a target letter was present, F (1, 7) = 6.11, p < .05. The
range of performance across letter-string types was 23.0% 
for target trials and 13.3% for catch trials. 

Figure 16 presents the percentage correct for the 
letter-string types as a function of word frequency. There 
was an overall effect of 20.5% for high word frequency items 
and 15.8% for low word frequency items, F (4, 28) = 2.19, p
< .10.

Correlation Analysis 

The correlations of several measures with overall 
performance in Experiment 4 are presented in Table 5. As 
with the previous analyses, log measures predicted
performance better than did linear measures. Trigram
frequency was the best of the three frequency measures, but
only slightly better than bigram frequency. Log word
frequency was correlated .50 with overall performance. Also
presented in Table 5 are the correlations of the same
variables with the average performance in Experiments 3 and
4. Since Experiment 4 was a replication of Experiment 3,
the performance on each item was averaged across the two
experiments. As might be expected from increased
reliability, these correlations are significantly larger
than for either of the experiments considered separately.

Figure 15. Percentage correct as a function of
display type for target and catch trials in Experiment
4.

Figure 16. Percentage correct as a function of
display type for items corresponding to high and low
word frequency in Experiment 4.

Multiple Regressions

For log single-letter counts, the sixth, fifth, and
second positions accounted for 7% of the variance. For log
bigram counts, the first and fifth positions accounted for
16% of the variance. For log trigram counts, the first and
fourth positions predicted 25% of the variance.

Summed log bigram frequency and Regularity(3) accounted
for 21% of the variance and the partial correlations for log
single-letter and log trigram frequencies were -.05 and .22,
respectively.

Experiment 5 

Method 

Subjects. Nineteen fourth graders from the Madison
Metropolitan School District participated as subjects and
were paid $5.00. The children were tested in the middle of
the school year, had normal or corrected vision, and were
administered the STEP (1979) reading test at the end of
third grade. Of the 19, all but five had scored at or above 
their grade level on the STEP reading test. The STEP scores 
ranged from 25 to 50 with an average score of 43. 

Stimuli and apparatus. The 200 items used were those
of List 1 from Experiment 3. Accordingly, one-half of the 
items were of high word frequency and one-half were of low 
word frequency. The anagrams represented a factorial 
arrangement of high or low frequency and regular or 
irregular. The same practice list as in Experiment 3 also 
was used. 

The stimuli were presented in the same manner and on 
the same equipment as in Experiment 3 with one exception. 
The range of durations of the test string presentation was 
increased to accommodate the less developed processing 
capabilities of fourth-grade subjects. 

Procedure. Subjects were tested in groups of 1-4. Upon
arrival for the experiment, the fourth graders spent about
10 minutes in various activities to allow them to adjust to
our laboratory. They were then instructed as a group about
what the experiment involved. This instruction proceeded in
two steps. First, the children listened and watched the
experimenter simulate the target search task using 4 index
cards. The experimenter showed cards printed with a test
string and a target letter and the children responded "Yes"
or "No" aloud. After some coaching and about six of these
trials, the children performed the task without error. The
second step involved explaining the task using the computer 
equipment. This also was performed as a group and each 
child attempted about 15 computer-generated trials while the 
remaining children watched. All adapted to the equipment 
readily and were able to perform the task. The children 
were then taken to their individual subject stations in 
separate rooms and tested on the practice list. 

As the children were responding, the computer displayed 
to the experimenter in an isolated room each child's average 
accuracy after each trial. This allowed the experimenter to 
monitor each child's progress, and, if necessary, to adjust 
the range of durations for the test letter string. The 
range was adjusted if any subject was not able to achieve 
75% accuracy even when the test string appeared for the
maximum duration allowed by the range of durations. The 
nineteen subjects participated in six different groups. The 
six groups differed in terms of the range of durations that 
was used and whether a mask appeared. As in Experiments 3 
and 4, the minimum string duration for all groups was 5
msec. For three groups, the maximum string duration ranged 
from 59 to 179 msec. For these groups a mask was used. The 
minimum mask duration was always 1 msec. The maximum amount 
of time that the mask remained on the CRT was increased by 
exactly the same amount that the test-string maximum
duration was increased. Accordingly, the maximum mask 
duration ranged from 55 to 175 msec for the three groups. 
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The stimulus onset asynchrony (SOA) between the test string 
and the mask always was equal to the maximum string duration 
plus 31 msec. As a result, the SOAs ranged from 90 to 210 
msec. For all groups the interval between the onset of the 
test string and the onset of the target letter was increased 
by exactly the same amount as the increase in the maximum 
duration of the test letter string (see Figure 1). For 
example, when the string duration was increased 20 msec, the 
time between the onset of the string and the onset of the 
target was lengthened 20 msec. For the remaining three 
groups of subjects, the maximum string durations ranged from 
249 to 499 msec and the mask was eliminated. The target 
letter followed the test string after an interval equal to 
the maximum duration of the test string. 
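The timing rules for the masked groups reduce to simple arithmetic, which the following sketch checks against the reported ranges. The function and constant names are illustrative and do not come from the original apparatus software. The 4-msec gap between the string and mask maxima is an inference from the reported ranges (59 vs. 55 msec and 179 vs. 175 msec), not a value stated in the text.

```python
# Sketch of the masked-group timing in Experiment 5 (all names are hypothetical).
SOA_OFFSET_MSEC = 31  # stated in the text: SOA = maximum string duration + 31 msec
MASK_GAP_MSEC = 4     # assumption inferred from the reported 59/55 and 179/175 ranges


def masked_timing(max_string_msec):
    """Return (SOA, maximum mask duration) in msec for a masked group."""
    soa = max_string_msec + SOA_OFFSET_MSEC
    max_mask = max_string_msec - MASK_GAP_MSEC
    return soa, max_mask


# Shortest and longest maximum string durations reported for the masked groups.
for max_string in (59, 179):
    soa, max_mask = masked_timing(max_string)
    print(max_string, soa, max_mask)
```

Under these assumptions the two endpoints give the SOA and mask ranges reported for the three masked groups.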

Following the 100 practice trials, the subjects were 
presented with each of the 200 items twice in each of two 
sessions. Thus, each subject was presented with each item 
twice on target trials and twice on catch trials. The
children were given a five-minute rest period after the
practice trials and after every 200 experimental trials.
The entire experiment lasted about 90 minutes. 

Results

Despite the widely varying test durations, the six
groups of subjects exhibited a similar pattern of results.
Therefore, no distinction among the groups was included in 
the data analysis. One subject was eliminated because his 
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overall accuracy was at chance. 

Figure 17 presents the results for the fourth-grade 
subjects as a function of display type for target and catch 
trials. Accuracy was higher for better structured letter 
strings, F (4, 68) = 40.2, p < .001. Words were recognized
9.9% better than the regular-high anagrams, F (1, 17) =
15.5, p < .005. There was a 3.0% advantage of regular over
irregular anagrams, F (1, 17) = 9.2, p < .01. Log bigram
frequency had only a .2% effect, F < 1.

Figure 18 reveals the interaction of display type and 
word frequency, F (4, 68) = 2.5, p = .05. This effect
reflected a 4.0% advantage of high frequency words over low
frequency words, F (1, 68) = 4.6, p < .05.

The slight difference between target (66.9%) and catch
(70.8%) trials was not significant, F < 1, and this variable
did not interact with letter-string type, F < 1. 

Correlation Analysis 

Table 6 presents the correlations of performance with 
several predictor variables. Correlations increased from 
linear to log counts and with increases in the size of the 
frequency measure. Overall, the pattern of results obtained 
for the fourth graders is similar to that found for adult 
subjects.

Multiple Regressions 
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Figure 17. Percentage correct as a function of
display type for target and catch trials in Experiment 5.
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Figure 18. Percentage correct as a function of 
display type for items corresponding to high and low 
word frequency in Experiment 5. 
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Table 6 

Correlations of Several Predictor Variables
with Overall Accuracy Performance in
Experiment 5.

Single letter
    linear              .23
    log                 .25
Bigram
    linear              .35
    log                 .43
Trigram
    linear              .38
    log                 .51
Word frequency
    linear              .46
    log                 .55
Regularity Count(3)     .36
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In a series of multiple regression analyses, the
individual log counts at each position were treated as 
independent variables. The log single-letter counts at the 
sixth and first serial positions accounted for 7% of the 
variance. For log bigrams, the first, fifth, second, and 
fourth positions accounted for 22% of the variance. For log 
trigrams, the first and fourth positions accounted for 31%.

Summed log bigram frequency and Regularity(3) accounted
for 24% of the variance. The partial correlations for
summed log single-letter and trigram frequencies were -.08
and .28, respectively.

Experiment 6 

In Experiments 1-5, large effects were found for words 
as compared to the best anagrams (regular-high). One way to 
account for the effect is by the lexical status of the 
words. Since words are represented in the reader's lexicon, 
they may be retrieved on the basis of partial visual 
information. For example, the partial information sho_l_ 
might lead to recognition of the word should. Lexical 
access would allow determination of the two unknown letters. 
On the other hand, the partial information shu_o_ can not 
access any lexical entry and the missing letters can not be 
determined. Consequently, on a word trial there is a better 
chance that all of the component letters will be available 
for comparison against the target letter. In contrast, the
same partial information about an anagram will not lead to 
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recognition of all of the letters in the test string. As a 
result, fewer letters of anagrams will be available for 
comparison against the target letter. This account is 
consistent with the model articulated in the Introduction; 
the secondary recognition process can lead to word 
recognition without complete recognition of all of the 
component letters.
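The lexical-access account can be illustrated with a small sketch in which a partial string is matched against a lexicon. The toy lexicon below holds a few six-letter words from the stimulus set, and the function name and the underscore convention for unresolved letters are assumptions made for illustration only; this is not the model articulated in the Introduction.

```python
# Toy lexicon: a handful of six-letter words from the stimulus set (Appendix 1).
LEXICON = ["almost", "around", "course", "during", "should", "social"]


def lexical_candidates(partial, lexicon=LEXICON):
    """Return lexicon entries consistent with a partial string.

    An underscore marks a letter position that was not resolved visually.
    """
    return [
        word
        for word in lexicon
        if len(word) == len(partial)
        and all(p == "_" or p == w for p, w in zip(partial, word))
    ]


# Partial information from a word: one entry matches, so the two
# unknown letters can be filled in from the lexicon.
print(lexical_candidates("sho_l_"))

# The same amount of partial information from an anagram: nothing
# matches, so the missing letters stay unknown.
print(lexical_candidates("shu_o_"))
```

With a full-sized lexicon a partial string could match several entries, so a more realistic sketch would need a rule for choosing among candidates.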

A second explanation of the word advantage is that 
words differ from even the best anagrams with respect to 
sublexical orthographic structure. For example, the bigram
frequency of the words in Experiments 3 and 4 averaged 
almost three log units more than that for the regular-high 
anagrams (see Figure 12). Perhaps accuracy was greater for 
words because words contained more frequent bigrams. 

To choose between these two explanations in Experiment
6, the words of Experiments 3, 4, and 5 were replaced with
regular anagrams which were matched with the words on log 
bigram frequency. If log bigram frequency was the basis of 
the word advantage, then a similar advantage should be 
observed for these regular-very high (R-VH) anagrams. 

Method 

Experiment 6 was conducted in the same manner as the 
previous experiments with adult subjects. Regular-very high
anagrams with similar log bigram frequencies to the words of 
Experiments 3 and 4 were used along with all the anagrams of 
Experiments 3 and 4. The summed log bigram frequencies for the




regular-very high anagrams were 'is. 940 and 13* ^07
respectively, almost identical to those of the words (see 
Figure 12). The regular-very high anagrams are listed in 
Appendix 4. The experiment was identical to Experiment 4 
except that the regular-very high anagrams were used in 
place of the word items. Seven new subjects obtained from
the same Introductory Psychology subject pool used in the 
previous experiments participated in exchange for course 
credit. They were informed that a target would occur on 50% 
of the trials and that none of the letter strings were 
words.

Results 

Figure 19 shows the differences among the five types of 
letter strings, F (4, 24) = 3.3, p < .05. There was a 0.3%
advantage of regular-very high anagrams over regular-high
anagrams, F (1, 6) < 1. Regular anagrams gave a 3.6%
advantage over irregular anagrams, F (1, 24) = 13.8, p <
.005. There was a -0.2% effect for log bigram frequency, F
(1, 6) < 1.

There was only a 3.0% advantage of catch over target 
trials, F (1, 6) < 1, but this variable interacted with the 
type of letter string, F (4, 24) = 3.3, p < .05. This test
reflects the presence of a 7.4% increase in accuracy from the
worst to the best structured strings for target trials, but 
an absence of an effect of orthographic structure for catch 
trials (Figure 19). 



[Figure 19 plot: percentage correct (axis range 50-100) by display type (R-VH, R-H, R-L, I-H, I-L), with separate curves for catch trials, target trials, and their average.]

Figure 19. Percentage correct as a function of
display type for target and catch trials in Experiment 6.
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Figure 20 presents the average accuracy as a function 
of letter string type according to whether the items 
correspond to high or low word frequency. Neither the 0.6%
difference nor the interaction with display type was
statistically significant. 

The results of Experiment 6 support the idea that
lexical status makes a significant contribution to 
perceptual recognition in the target search task. The
contribution of lexical status in the earlier experiments 
can not be attributed to sublexical orthographic structure 
differences in log bigram frequency. That is to say, the 
reader takes into account not only the frequency of 
occurrence of letter sequences and the regularity of these 
sequences, but also whether or not a particular sequence is 
represented in a word. Frequency and regularity allow 
well-structured anagrams to be better recognized than poorly 
structured anagrams and, in addition, lexical status allows
a perceptual advantage of words over equally well-structured
anagrams.

Correlation Analysis 

Table 7 presents the correlations of several variables 
with overall performance. The correlations, while 
attenuated, exhibit a pattern similar to the previous
experiments. Log measures are better than linear measures; 
the regularity measure does about as well as the best 
frequency measure. 
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Figure 20. Percentage correct as a function of 
display type for items corresponding to high and low
word frequency in Experiment 6. 






Table 7 

Correlations of Several Predictor
Variables with Overall Accuracy Performance
in Experiment 6.

Single letter
    linear              .11
    log                 .11
Bigram
    linear              .05
    log                 .11
Trigram
    linear              .03
    log                 .05
Regularity Count(3)    -.11
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Multiple Regressions 

Summed log bigram frequency and Regularity(3) accounted
for 3% of the variance. The partial correlations for summed 
log single-letter and trigram frequencies were .01 and -.06, 
respectively. 

Experiment 7 

Experiments 1-6 were not successful in choosing between 
frequency and regularity measures of orthographic structure. 
In a final evaluation of these measures, the perceptual 
recognition task was replicated with five display types. 
The display types were chosen to give a large range of 
regularity and frequency. The comparisons among display 
types and the post hoc correlations will be used to evaluate 
the relative contributions of lexical status, regularity, 
and frequency in perceptual recognition.

Method 

The R-VH anagrams from Experiment 6 and the words,
regular-low anagrams, and irregular-high anagrams from 
Experiment 3 were used as items along with a new type of 
anagram that was both very irregular and very low in log 
bigram frequency. The very irregular, very low (VI-VL) 
anagrams mostly had three or four irregularities and 
average log bigram frequencies of 4.845 and 4.971 for the 
high and low word frequency items, respectively. The VI-VL 
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anagrams are listed in Appendix 4. The procedure of 
Experiment 4 was replicated exactly. Eleven new subjects 
from the Introductory Psychology subject pool were used.

Results

As can be seen in Figure 21, accuracy uniformly 
increased with better structured letter strings, F (4, 40) =
69.4, p < .001. Words had a 10.8% advantage over the
regular-very high anagrams, F (1, 40) = 15.9, p < .001.
Regular-very high anagrams gave a performance advantage of
4.9% over regular-low anagrams, F (1, 40) = 3.1, p < .086; a
7.6% advantage over irregular-high anagrams, F (1, 40) =
7.9, p < .01; and an 8.9% advantage over very irregular-very
low anagrams, F (1, 40) = 10.7, p < .005.

The 6.9% advantage of catch over target trials was not 
significant, F (1, 10) = 1.5, p > .25, and this difference
did not interact with display type, F < 1.

Figure 22 reveals a 2.7% difference between levels of 
word frequency, F (1, 10) = 32.17, p < .001, but the
advantage of high word frequency occurred only for the words 
and the regular anagrams. 

Correlations and Regressions 

The correlations of several predictor variables with 
overall performance are presented in Table 8. As usual the 
log measures predict performance better than the linear 
measures. Bigrams and trigrams were similar in predictive 
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Figure 21. Percentage correct as a function of 
display type for target and catch trials in Experiment 7.
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Figure 22. Percentage correct as a function of
display type for items corresponding to high and low
word frequency in Experiment 7.
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Table 8

Correlations of Several Predictor
Variables with Overall Accuracy in Experiment 7.

Single letter
    linear              .35
    log                 .37
Bigram
    linear              .40
    log                 .50
Trigram
    linear              .41
    log                 .59
Word frequency
    linear              .38
    log                 .54
Regularity Count(3)
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ability. Log word frequency correlated .42 with overall 
performance, but also correlated .48 with summed log bigram
frequency and .75 with summed log trigram frequency.
Considered together, the results of Experiment 7 replicate 
previous experiments. 

Summed log bigram frequency and Regularity(3) accounted
for 27% of the variance. The partial correlations for log 
single-letter and log trigram frequencies were -.04 and .36, 
respectively. 

Experiment 8 

The perceptual recognition task assesses the degree to 
which readers utilize orthographic structure in visual 
processing of letter strings. An overt judgment task has 
also been used to assess the degree to which this knowledge
is available for a conscious report (Massaro et al., 1980;
Rosinski & Wheeler, 1972). An overt judgment task is used
in the present experiment to assess the degree to which
regularity and frequency are consciously available.
Subjects are given pairs of letter strings and asked to
choose which letter string most resembles English spelling.
Some subjects are instructed to base their decision on the
frequency of occurrence of letter sequences in English
spelling; other subjects are instructed to respond on the
basis of the regularity of letter sequencing. The seven
types of letter strings varying in lexical status,
regularity, and log bigram frequency were paired with each
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other in the task. The degree to which subjects can follow 
instructions and discriminate among the types of items 
should reveal which aspect(s) of orthographic structure is 
(are) consciously available and capable of report. 

Method 

Subjects. Sixteen Introductory Psychology students who
met the same requirements as in the first six experiments
were used as subjects. 

Stimuli and apparatus. The 280 letter strings
represented all seven categories of items used in the
previous experiments. Accordingly, 40 words and their
corresponding R-H, R-L, I-H, and I-L anagrams of Experiment
3, 40 R-VH anagrams of Experiment 6, and 40 VI-VL anagrams
of Experiment 7 were selected. The irregular items chosen
from Experiment 3 had two irregularities. Seven categories,
and allowing the two letter strings of a pair to be from the
same category, result in 28 unique pairs of categories.
These 28 pairs were sampled randomly without replacement in
each block of 28 trials. The actual items from each of the
categories for each pair were randomly selected with
replacement for each group of subjects. Eventually, each
subject was presented with 840 pairs for judgment, resulting
in a total of 30 observations for each pair. The two
strings of each pair were arranged side-by-side on the CRT.
The horizontal visual angle of each letter string was 1.9
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degrees with a 2.7 degree separation between strings.
Depending on instructions, subjects selected the one of the 
two strings which was more regular or more frequent. 
Subjects indicated their choice by pressing one of two keys 
located beneath the string. Each trial began with a 250 
msec fixation point followed by the two letter strings. The 
strings remained on the CRT until all the subjects responded 
or for a maximum of four seconds. The 840 pairs were
presented in two sessions of 420 trials each. Each session
lasted about 20 minutes. Of the 16 subjects, 8 were given
the regularity instructions and 8 were given the frequency
instructions. The exact instructions are given in
Appendices 5 and 6, respectively.
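The pair counts in this design follow from counting unordered pairs of the seven categories with same-category pairs allowed. The sketch below reproduces that arithmetic. The variable names are illustrative, and the block covers only the counting, not the random sampling of pairs or items.

```python
from itertools import combinations_with_replacement

# The seven categories of letter strings used in Experiment 8.
CATEGORIES = ["Word", "R-VH", "R-H", "R-L", "I-H", "I-L", "VI-VL"]

# Unordered pairs of categories, with same-category pairs allowed.
category_pairs = list(combinations_with_replacement(CATEGORIES, 2))

# Each block of trials presents every category pair once.
TOTAL_TRIALS = 840
observations_per_pair = TOTAL_TRIALS // len(category_pairs)

print(len(category_pairs), observations_per_pair)
```

The count is 21 pairs of different categories plus 7 same-category pairs, and 840 trials divide evenly into blocks of that size.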

Results 

For each subject, the proportion of times that each of 
the seven categories was chosen as most like English over
the other six categories was computed. These proportions
were entered into an analysis of variance with instructions,
category type, and subjects as factors. Figure 23 presents
the percentage of choices of most like English as a function
of category and instructions. There was a large decrease in
choices with decreases in orthographic structure, F (6, 84)
= 351, p < .001. However, instructions had no influence on
performance and did not interact with structure, Fs < 1.
All differences between adjacent categories in Figure 23 are 
statistically significant, except for the small difference 
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Figure 23. Average percentage choices of each of
the display types in the overt judgment task in
Experiment 8.
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between R-H and R-L items. 

Table 9 presents the proportion of times each of the 
seven classes of items was chosen over the other six 
classes. Some effects of instructions not apparent in the 
average proportions shown in Figure 23 are seen in these 
results. The R-H items were picked over the R-VH items 39%
of the time for regularity instructions and 25% of the time
for frequency instructions. The three classes of irregular
items (I-H, I-L, and VI-VL) were chosen an average of 13% of
the time over the R-L items with regularity instructions and
an average of 18% of the time with frequency instructions.
Regular-very high items were chosen over words 17% of the
time for regularity instructions and 25% of the time for 
frequency instructions. All of these results are consistent 
with the idea that frequency carries somewhat more weight in 
the overt judgment task with frequency instructions than 
with regularity instructions. 

An analysis of variance also was conducted on the 
reaction times of the choice responses. Response type was 
included as a factor to assess the differences in reaction 
times between choosing a given category as most like English 
relative to the average reaction time for choosing the other
six categories. Figure 24 presents the reaction times as a
function of instructions, response type, and category.
Overall reaction times were 232 msec longer for frequency
than for regularity instructions, F (1, 14) = 6.9, p < .025.
Reaction times increased with decreasing orthographic 
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Table 9 

The Proportion of Times the Row Item was Chosen Over
the Column Item for Regularity and Frequency
Instructions.

Regularity Instructions

         Word   R-VH   R-H    R-L    I-H    I-L    VI-VL
Word     .55
R-VH     .17    .52
R-H      .12    .39    .50
R-L      .15    .32    .50    .51
I-H      .02    .11    .21    .17    .118
I-L      .01    .09    .15    .17    .36    .?8
VI-VL    .00    .02    .09    .06    .28    .30    .53

Frequency Instructions

         Word   R-VH   R-H    R-L    I-H    I-L    VI-VL
Word     .55
R-VH     .25    .50
R-H      .'8    .25    .57
R-L      .11    .32    .11    .55
I-H      .07    .111   .27    .25    .50
I-L      .02    .09    .17    .20    .38    .52
VI-VL    .03    .05    .09    .09    .211   .35    .19
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Figure 24. Average reaction times for choosing
each of the display types (selected) and for choosing 
the alternative member of the pair (unselected) in 
Experiment 8. 
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structure, but only when that category was chosen as the 
item most like English, F (6, 84) = 12.7 and 21.0, ps <
.001.

A comparison between the perception task and overt 
judgment task shows that the latter is a much more sensitive 
measure of the reader's knowledge of orthographic structure. 
Subjects are able to discriminate among certain classes of 
items in the overt judgment task that are responded to 
equivalently in the perceptual accuracy task. For example, 
R-VH and R-H were differentiated in the overt judgment task 
but were responded to equivalently in the perceptual 
accuracy task of Experiment 6. The same was true of the I-H
vs I-L and the I-H vs VI-VL contrasts. Other results were
exactly parallel in the two tasks: Words have an advantage
over regular items and regular items have an advantage over
irregular items. 

Discussion 

In summary, the orthogonal contrasts of lexical status, 
word frequency, position-sensitive frequency, and regularity 
provide evidence for the following conclusions. Lexical
status provides a perceptual advantage of words over equally 
well-structured anagrams. Word frequency appears to add 
very little, if anything, beyond that accounted for by 
lexical status. Regular anagrams are recognized 
significantly better than irregular anagrams whereas log 
bigram frequency has no influence when regularity is
controlled. However, the post hoc correlations of these 
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measures of orthographic structure with perceptual 
recognition of each of the 400 test items complicate these
conclusions somewhat. Log bigram frequency usually 
correlates positively with performance on letter strings 
even after the influence of lexical status and regularity 
has been removed. In this regard, frequency measures allow 
a fine-grained description of orthographic structure that 
provides a good index of performance on individual letter 
strings. The binary classification of lexical status and 
the small range of the number of irregularities limit the 
usefulness of these measures as descriptions of a relatively 
continuous variation in orthographic structure. Some 
frequency weighting of regularity description might lead to 
an improved measure of structure. Until such a description
is developed, it appears necessary to include lexical 
status, frequency, and regularity to account for those 
components of orthographic structure that are 
psychologically real. The results of the present studies 
also are relevant to previous studies of orthographic
structure.

The utilization of orthographic structure in reading 
was first studied by Miller, Bruner, and Postman (1954), who
had subjects reproduce letter sequences presented
tachistoscopically. The strings were all eight letters and
corresponded to different approximations to English based on
Shannon's (1948, 1951) algorithms. Miller et al. found that
performance improved with better approximations to English. 
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By correcting for the relative information per letter in the 
strings, the amount of information transmitted was shown to 
be equal for the various approximations. This result is 
consistent with more recent empirical and theoretical work 
demonstrating that orthographic structure provides an 
independent source of information to the reader (Massaro, 
1979a, b; Massaro et al., 1980). In fact, the
approximation-to-English algorithms may be viewed as early
descriptions of orthographic structure. Accordingly, the
more recent studies replicate and extend the Miller et al. 
results. The major advances in the recent studies are the 
more precise descriptions of structure (see Massaro et al., 
1980, Chapter 3) and the quantitative modeling of the 
processes by which visual information combines with 
structure during word recognition (Massaro, 1979a, b).

Related research by Gibson and her colleagues evaluated 
the role of word length and pronounceability in a full
report of letter strings by both hearing and deaf readers
(Gibson, Pick, Osser, & Hammond, 1962; Gibson, Shurcliff, &
Yonas, 1970). They found that the number of errors
increased with increases in word length and decreased with
increases in pronounceability. In the post hoc regression
analyses, word length accounted for 72% of the variance and
pronounceability accounted for another 15%. Position-
sensitive and word length specific bigram and trigram
frequencies were significantly poorer predictors of
performance. However, these counts can not be used in any 
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straightforward manner for items of various letter lengths. 
Words of different lengths do not occur with equal frequency 
and the less frequent word lengths will naturally have 
bigrams and trigrams with smaller counts. Therefore, this 
comparison can not be considered an adequate test between 
pronounceability and frequency measures of orthographic
structure. The finding that deaf and hearing readers were
influenced similarly by pronounceability argues that
orthographic structure rather than pronounceability is the 
important structural variable. 

Manelis (1974) found an advantage of four-letter words
over pseudowords in tachistoscopic recognition, but failed 
to find a significant correlation between recognition and 
summed linear bigram and trigram frequencies, as measured by 
Mayzner and Tresselt (1965) and Mayzner, Tresselt, and Wolin
(1965). In a more recent study, McClelland and Johnston
(1977) independently varied position-sensitive bigram
frequency and lexical identity in four-letter strings in a
Reicher-Wheeler forced-choice task (Reicher, 1969; Wheeler,
1970). In addition, a full report of the four letters
either preceded or followed the forced-choice response. The
forced-choice responses revealed neither an effect of bigram
frequency nor an advantage of words over orthographically
regular pseudowords. Forced-choice responses also showed a
13% advantage of words and pseudowords over single letters,
replicating the word-letter difference of Reicher (1969). 
The full report score replicated the absence of a bigram 
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frequency effect found for the forced-choice task, but 
showed an advantage of words over pseudowords. Also, the
full report score revealed a large word frequency effect.
In post hoc analyses, McClelland and Johnston report that
bigram frequency did not correlate with perceptual accuracy
whereas single-letter position frequency was highly
correlated with accuracy. 

Using a different task, Henderson and Chard (1980)
presented items either high or low in both position-
sensitive single-letter and bigram frequencies in a lexical
decision task. Their results indicate that second and
fourth graders were faster in rejecting low-frequency than
high-frequency six-letter nonwords. In a related study,
Bouwhis (Note 3) found that single-letter positional
frequency correlated with lexical decisions for three-letter
items in Dutch. Reaction times to words decreased while
reaction times to pseudowords increased with increases in
single-letter frequency. Similarly, subjects tended to
respond "word" more often to both words and pseudowords if
the items were of high single-letter frequency. In
contrast, these correlations were considerably diminished
when bigram positional frequency was used as the predictor
variable. Bouwhis' results, when compared with those of
Henderson and Chard (1980), imply that the power of the
bigram frequency measure with our six-letter items may not
generalize completely to smaller letter-string lengths.
That is, bigram frequency appears to have more predictive 
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power than single-letter frequency when the items are six 
letters in length, but the reverse is indicated when the 
items are three or four letters in length. However, such a 
conclusion is tenuous for two reasons. First, single-letter 
and bigram measures are highly correlated even for small
letter-string lengths. Second, experiments demonstrating
the predictive power of single-letter frequencies have used 
linear rather than log counts. Since log counts are 
uniformly better predictors of the data, it will first have 
to be shown that log counts do not change the relative power 
of the two frequency measures. Of the several studies 
(McClelland & Johnston, 1977; Bouwhis, Note 3) investigating 
the relative contributions of single-letter and bigram 
frequencies, only the present studies directly compare 
linear and log single-letter and bigram counts. The present 
studies found that log counts are consistently better than 
linear counts and that bigram counts are better than 
single-letter counts. 

In the first of two experiments investigating other 
structural variables, Spoehr (1978) showed that report
accuracy in the Reicher-Wheeler task was lower for five-
letter, one-syllable strings made up of five phonemes than
for those made up of four phonemes. Performance for words such
as thump and pseudowords such as sherk averaged 76% correct
whereas performance was 4% worse for words such as spank
and pseudowords such as crost. Average accuracy on words
was 1% greater than on pseudowords. In the second 
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experiment, two-syllable words were recognized 13% more
poorly than one-syllable words when phoneme length was
equated. Although Spoehr (1978) showed that position-
sensitive bigram frequency of the letters could not account
for the observed differences, log counts might have been
more appropriate. Furthermore, since our counts were
derived from the considerably larger Kucera and Francis 
(1967) corpus, they are likely more reliable than the 
Mayzner and Tresselt (1965) counts that Spoehr employed. 
Accordingly, Spoehr's results are not necessarily
inconsistent with the present results. 

In conclusion, a number of recent experiments have 
failed to find significant effects of position sensitive
bigram frequency in the perceptual recognition of letter
strings. However, these studies all used linear rather than
log counts and we have found much larger effects with the
log counts. Linear and log counts correlate .8^ and .66 for
single letter and bigram position sensitive counts for four
letter words, .86 and .76 for five letter words and .85 and
.76 for six letter words in the Kucera-Francis corpus.
Therefore, there is sufficient room for improvement of log
over linear counts in accounting for perceptual recognition.
Previous studies and analyses of completed experiments 
should evaluate log as well as linear counts to provide a 
sufficient test of frequency measures of perceptual 
recognition . 
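The contrast between linear and log counts can be illustrated with a minimal sketch. It assumes that a position-sensitive bigram count is the summed frequency of all same-length words containing that bigram at that position, and that a bigram with a count of zero contributes nothing to the log sum. The toy corpus, its frequencies, and the function names are hypothetical and are not taken from the Kucera and Francis (1967) counts.

```python
import math
from collections import Counter

# Hypothetical toy corpus: six-letter word -> made-up frequency count.
TOY_CORPUS = {
    "almost": 432, "around": 561, "course": 465, "during": 585,
    "itself": 304, "longer": 193, "making": 255, "modern": 198,
    "coding": 3, "coined": 3, "confer": 3, "consul": 3,
}


def positional_bigram_table(corpus):
    """Sum word frequencies for every (position, bigram) pair."""
    table = Counter()
    for word, freq in corpus.items():
        for pos in range(len(word) - 1):
            table[(pos, word[pos:pos + 2])] += freq
    return table


def summed_counts(string, table):
    """Return (linear_sum, log_sum) of position-sensitive bigram counts."""
    linear = 0
    log_sum = 0.0
    for pos in range(len(string) - 1):
        count = table.get((pos, string[pos:pos + 2]), 0)
        linear += count
        # Assumption: a zero count adds 0 to the log sum.
        if count > 0:
            log_sum += math.log10(count)
    return linear, log_sum


def pearson(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


table = positional_bigram_table(TOY_CORPUS)
linear_sums, log_sums = zip(*(summed_counts(w, table) for w in TOY_CORPUS))
r = pearson(linear_sums, log_sums)
print(f"linear vs. log correlation on the toy corpus: {r:.2f}")
```

Because the log transform compresses the contribution of very frequent bigrams, the two measures can rank the same strings differently, which is why a correlation below 1 leaves room for one measure to predict recognition better than the other.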
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Appendix 1 



The 200 stimulus items from Experiment 1. The items 
are listed in columns according to type, where W=word; 
R-H=regular, high; I-H=irregular, high; R-L=regular, low; and 
I-L=irregular, low. Each row contains a word and the four 
anagrams generated from it. The number beside each item 
indicates the number of orthographic irregularities that the 
item contains as determined by the rules in Table 2. The 
items are also grouped in high and low Kucera and Francis 
(1967) word frequency. 

High Word Frequency 

W         R-H       R-L       I-H        I-L 
almost 0  latoms 0  amtols 1  stlmao 4   mltsoa 4 
around 1  adourn 1  orudan 1  nroudg 2   ndauor 2 
course 0  coures 0  rosuce 0  csouer 2   reucso 1 
during 0  nurdig 0  nirdug 0  nurdgi 1   grdniu 4 
itself 0  fiselt 0  fliste 0  eistlf 1   fetsli 1 
longer 0  roleng 0  nogerl 0  nlgoer 3   grlneo 2 
making 0  mikang 0  gimank 0  agkinm 1   gmkian 2 
modern 0  remond 0  noderm 0  dnmoer 3   monrde 1 
mother 0  hemort 0  rhetom 0  ormteh 1   tmoreh 2 
nature 0  tanure 0  nutear 0  unrtea 2   ntreua 2 
number 0  rumben 0  runemb 0  bemrnu 2   brnemu 2 
others 0  horest 0  rohets 0  strheo 2   etrsoh 2 
period 1  ipoder 1  iperod 1  cipoier 2  opdrei 1 
public 0  piculb 0  cipbul 0  licpbu 2   pbiclu 2 
reason 0  sonare 0  ronase 0  nroaes 2   rsaneo 2 
second 0  snoced 0  cenods 0  oscned 1   osndce 2 
should 0  hoduls 0  hudols 0  lsoudh 3   lohdsu 2 
social 0  sicoal 0  ocasil 1  calsoi 1   oaiscl 2 
toward 0  wartod 0  drawot 0  wtaord 2   rtwado 2 
turned 0  rundet 0  endrut 0  retdnu 2   unedtr 1 
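Because each row of Appendix 1 pairs a word with four anagrams generated from it, the listing can be checked mechanically. The sketch below is a hypothetical helper, not part of the original procedure: it compares letter multisets and flags any string that is not a true anagram of its source word, which is one way to spot items that may have been mistranscribed. The two rows shown are copied from the listing as printed.

```python
from collections import Counter


def is_anagram(word, candidate):
    """True when candidate uses exactly the letters of word."""
    return Counter(word) == Counter(candidate)


# Two rows from the High Word Frequency listing, as printed.
ROWS = {
    "almost": ["latoms", "amtols", "stlmao", "mltsoa"],
    "around": ["adourn", "orudan", "nroudg", "ndauor"],
}

# Collect every (word, string) pair that fails the anagram check.
suspect = [
    (word, candidate)
    for word, candidates in ROWS.items()
    for candidate in candidates
    if not is_anagram(word, candidate)
]
print(suspect)
```

A flagged item does not say what the intended string was; it only marks the entry for comparison against a clean copy of the appendix.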
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Low Word Frequency 
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Appendix 2 

The 200 stimulus items from Experiment 2. The number 
of irregularities is indicated beside each item. 
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Low Word Frequency 
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Appendix 3 

The 400 items used in Experiments 3 and 4 along with 
percentage correct for each experiment (E3, E4), summed log 
single-letter frequency (SL), summed log bigram frequency 
(BI), summed log trigram frequency (TRI), log word frequency 
(FREQ) and number of irregularities (I). 
a3inkt  69.4  84.4  19.93  7.25  1.65  0.00  2 
orawta  oy.4  68.7  21.14  3.31  2.35  0.00  2 
uerndt  77.8  65.6  21.75  9.39  2.43  0.00  2 
oulmev  69.4  65.6  18.21  9.30  2.00  0.00  2 
kdewla  53.3  68.7  17.85  6.06  0.00  0.00  2 
deatnw  55.6  56.2  22.05  9.35  3.79  0.00  1 
drkoew  66.7  68.7  21.02  6.60  0.00  0.00  1 
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Low Word Frequency 
Words 

ITEM     E3    E4    SL     BI     TRI    FREQ  I 
coding   3579  78.1  22.56  15.73   4.61  0.48  0 
coined   91.7  87.5  23.62  15.84   7.84  0.48  0 
confer   97.2  84.4  22.31  13.13   4.57  0.48  0 
consul   30.6  75.0  21.39  12.05   3.73  0.48  0 
copied   33.3  73.1  23.50  14.43   ?.99  0.48  0 
dearth   97.2  90.6  21.91  14.??   ?.14  0.48  0 
delays   97.2  90.6  21.50  13.72   4.??  0.48  ? 
depict   83.9  90.6  22.20  13.73   4.70  0.48  0 
dispel   91.7  87.5  2?.23  12.14   ?.11  0.48  0 
divers   94.4  37.5  22.28  13.94   ?.4?  0.??  0 
easing   77.3  71.9  22.75  15.37   9.99  0.43  1 
faiths   80.6  84.4  22.24  13.61   ?.93  0.48  0 
famine   33.3  84.4  23.26  14.93   9.85  0.48  0 
fathom   38.9  84.4  21.67  14.26   7.32  0.48  0 
forage   77.8  84.4  22.70  13.70   7.81  0.48  0 
forged   38.9  37.5  23.49  15.73   9.63  0.48  0 
frayed   91.7  84.4  22.11  14.49   3.49  0.48  0 
gamble   97.2  96.9  22.01  14.03   4.34  0.48  0 
glazes   38.9  90.6  20.99  11.95   4.49  0.48  0 
golfer   91.7  96.9  21.74  11.26   3.61  0.48  0 
gulped   80.6  78.1  22.47  13.19   5.57  0.48  0 
hounds   91.7  81.2  21.65  13.60   9.01  0.48  0 
hurdle   77.8  93.7  22.50  14.27   7.39  0.48  0 
jurist   98.9  84.4  21.72  14.45   8.08  0.48  0 
lather   75.0  81.2  23.23  16.33  11.16  0.48  0 
magnet   88.9  37.5  22.82  13.48   6.01  0.48  0 
masked   91.7  96.9  23.18  14.65   7.78  0.60  0 
outcry   86.1  87.5  21.47   7.34   1.91  0.48  1 
punish   91.7  78.1  22.01  12.92   6.38  0.48  0 
raving   94.4  81.2  ?2.39  15.77   9.44  0.48  0 
repaid   33.3  87.5  22.15  12.95   7.13  0.48  0 
scrape   94.4  90.6  21.88  12.00   5.33  0.48  0 
sinful   91.7  81.2  20.58  10.21   5.05  0.48  0 
slated   30.6  37.5  23.70  15.98   9.31  0.48  0 
spiced   91.7  81.2  22.43  14.26   7.24  0.48  0 
sultan   36.1  75.0  22.54  11.97   2.18  0.43  0 
tripod   91.7  84.4  22.16  10.85   2.60  0.48  0 
tropic   36.1  84.4  20.83  13.56   5.24  0.48  0 
truism   33.9  31.2  21.19  10.16   4.64  0.48  0 
unlock   86.1  87.5  19.86  10.14   4.14  0.48  1 
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Regular-High 

ITEM     E3    E4    SL     BI     TRI   FREQ  I 
dlcgon   69.4  62.5  21.78  ??.92  1.78  0.00  0 
diceon   61.1  81.1  22.26  ?1.63  2.03  0.00  0 
cefnor   77.8  63.7  22.61  10.91  0.00  0.00  0 
conlus   80.6  65.6  22.03  10.15  1.63  0.00  0 
diecop   72.2  50.0  19.85  10.15  1.96  0.00  0 
rhetad   72.2  50.0  22.36  10.82  1.01  0.00  0 
ladsey   52.8  75.0  22.13  12.17  3.22  0.00  0 
dipect   91.7  87.5  21.77  12.52  5.19  0.00  0 
lipsed   75.0  31.2  22.55  12.59  3.66  0.00  0 
vierds   77.8  78.1  ??.16  12.10  5.21  0.00  0 
isagen   75.0  59.1  21.13  11.92  1.16  0.00  1 
tashif   55.6  50.0  20.99  10.58  2.96  0.00  0 
mifane   61.1  78.1  22.55  11.71  2.30  0.00  0 
thamof   69.1  68.7  20.85  10.10  2.68  0.00  0 
gafeor   66.7  78.1  22.17  10.10  1.81  0.00  0 
foderg   77.8  81.2  22.11  11.55  1.88  0.00  0 
fedary   66.7  65.6  22.15  11.32  2.37  0.00  0 
begalm   72.2  78.1  21.66  10.65  3.81  0.00  0 
zaesel   66.7  75.0  19.32   7.66  1.36  0.00  0 
fogler   91.1  81.1  22.69  12.02  3.11  0.00  0 
gudpcl   72.2  75.0  21.36   9.86  1.61  0.00  0 
hunjds   66.7  78.1  21.66  11.51  2.59  0.00  0 
hudler   91.7  90.6  22.11  11.67  3.82  0.00  0 
jisurt   72.2  78.1  21.68  11.50  3.59  0.00  0 
lehart   77.8  59.1  22.12  12.62  3.01  0.00  0 
tamgen   81.3  81.1  22.72  11.65  1.63  0.00  0 
kadems   63.9  68.7  20.65  10.15  0.00  0.00  0 
otucry   75.0  71.9  20.86   8.72  0.60  0.00  1 
hinsup   72.?  81.1  19.33  10.56  2.19  0.00  0 
vignar   77.8  59.1  2?.21  11.79  3.80  0.00  0 
riedap   72.2  75.0  20.02  10.35  0.00  0.00  0 
rescap   30.6  71.9  20.18  11.32  5.76  0.00  0 
lufins   72.2  63.7  22.57  12.27  3.65  0.00  0 
lesdat   80.6  68.7  22.20  12.08  1.81  0.00  0 
pidecs   36.1  90.6  22.30  12.36  1.83  0.00  0 
tanuls   66.7  78.1  23.05  13.86  2.10  0.00  0 
tirpod   72.2  81.2  22.65  10.72  0.60  0.00  0 
coprit   80.6  78.1  21.85  11.18  1.68  0.00  0 
triums   77.8  68.7  21.85  10.15  1.90  0.00  0 
uncolk   72.2  75.0  20.20  10.20  2.51  0.00  1 
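For post hoc analyses of individual items, each Appendix 3 row can be read into a record and correlated against a predictor. The sketch below is illustrative only: the `Item` record, `parse_row`, and `pearson` are hypothetical names, the three rows are copied from the Regular-High listing, and a correlation over three items says nothing about the full data set.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    item: str
    e3: float            # percentage correct, Experiment 3
    e4: float            # percentage correct, Experiment 4
    sl: float            # summed log single-letter frequency
    bi: float            # summed log bigram frequency
    tri: float           # summed log trigram frequency
    freq: float          # log word frequency
    irregularities: int  # number of orthographic irregularities


def parse_row(line):
    """Parse one 'ITEM E3 E4 SL BI TRI FREQ I' row into an Item."""
    name, e3, e4, sl, bi, tri, freq, irr = line.split()
    return Item(name, float(e3), float(e4), float(sl),
                float(bi), float(tri), float(freq), int(irr))


def pearson(xs, ys):
    """Plain Pearson product-moment correlation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


ROWS = [
    "rhetad 72.2 50.0 22.36 10.82 1.01 0.00 0",
    "ladsey 52.8 75.0 22.13 12.17 3.22 0.00 0",
    "dipect 91.7 87.5 21.77 12.52 5.19 0.00 0",
]
items = [parse_row(row) for row in ROWS]
r_bi_e3 = pearson([i.bi for i in items], [i.e3 for i in items])
print(f"BI vs. E3 over {len(items)} sample rows: {r_bi_e3:.2f}")
```

The same pattern extends to SL, TRI, or the irregularity count by swapping the attribute passed to `pearson`.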
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Regular-Low 



ITEM     E3    E4    SL     BI     TRI   FREQ  I 
nidcog   ?     ?     ?      ?      0.00  0.00  0 
nidcoe   ?     68.7  21.38  ?.20   0.00  0.00  0 
cerfon   83.3  ?     ?      ?      ?     0.00  0 
scolun   ?     ?     ?      ?      ?     0.00  0 
pidcoe   ?     ?     ?      ?      0.00  0.00  0 
tedhar   ?     53.1  21.83  ?      0.00  0.00  0 
saldye   ?     ?     ?      ?      ?     0.00  0 
l10pec   ?     ?     ?      ?      ?     0.00  0 
xenpls   ?     ?     ?      ?      ?     0.00  0 
5evriu   ?     ?     ?      ?      ?     0.00  0 
?        ?     ?     ?      8.16   0.00  0.00  1 
nisiau   ?     ?     ?      ?      0.00  0.00  0 
nciicam  ?     ?     ?      ?      0.00  0.00  0 
monaiu   ?     ?     ?      ?      0.00  0.00  0 
realug   ?     ?     ?      ?      2.76  0.00  0 
?        ?     78.1  ?      8.56   0.00  0.00  0 
reyia    ?     ?     ?      ?      0.00  0.00  0 
meluag   ?     ?     ?      ?      ?     0.00  0 
zesiag   ?     ?     ?      ?      0.00  0.00  0 
roiIeg   ?     ?     ?      ?      ?     0.00  0 
pedlug   ?     ?     ?      ?      0.00  0.00  0 
nosuun   ?     ?     ?      ?      0.00  0.00  0 
neirua   ?     ?     ?      ?      ?     0.00  0 
jisrut   ?     ?     ?      ?      0.00  0.00  0 
tenar1   ?     ?     ?      ?      ?     0.00  0 
nemtag   ?     ?     ?      ?      0.00  0.00  0 
medkas   ?     ?     ?      ?      ?     0.00  0 
otcruy   ?     ?     ?      ?      0.00  0.00  1 
siphun   ?     ?     ?      ?      0.00  0.00  0 
vinrag   72.2  ?     ?      ?      ?     0.00  0 
reipad   69.4  78.1  ?      ?      0.00  0.00  0 
serpao   ?     ?     ?      ?      ?     0.00  0 
lisfun   72.2  56.2  20.41  8.50   1.98  0.00  0 
tedlas   69.4  62.5  22.06  8.96   1.76  0.00  0 
sipdec   72.2  78.1  21.73  7.49   0.00  0.00  0 
tanlus   72.2  75.0  21.88  9.76   1.92  0.00  0 
ridpot   72.2  78.1  21.66  9.12   1.83  0.00  0 
ritcop   69.4  90.6  20.31  8.86   1.40  0.00  0 
tisrum   ?     62.5  20.70  8.51   1.61  0.00  0 
ulnock   33.3  81.4  19.84  4.56   0.48  0.00  1 
Irregular-High 

ITEM     E3    E4    SL     BI     TRI   FREQ  I 
0ilglOu  77.8  62.5  22.?3  10.03  ?     0.00  2 
cnoLed   75.0  75.0  23.66  11.65  2.20  0.00  2 
frncco   72.2  75.0  20.59  10.91  1.00  0.00  2 
sncuLo   77.3  87.5  21.10  10.18  0.00  0.00  2 
eicopd   61.1  75.0  21.76  10.22  0.00  0.00  2 
enatrd   72.2  65.6  22.76  10.97  0.00  0.00  2 
yladss   63.9  62.5  21.23  12.20  5.55  0.00  1 
peticd   77.8  56.2  23.19  12.35  3.36  0.00  1 
pidise   75.0  78.1  22.02  12.46  2.23  0.00  1 
sdvier   69.4  78.1  22.03  11.76  5.84  0.00  1 
aiengs   69.4  62.5  22.2?  11.94  4.01  0.00  2 
fhasit   69.4  71.9  21.79  10.57  1.38  0.00  1 
faiemn   63.9  63.7  22.00  11.70  4.11  0.00  2 
athomf   52.8  65.6  20.05  10.36  0.90  0.00  2 
reoagf   61.1  62.5  21.40  10.05  0.00  0.00  2 
dcorgf   63.9  68.7  21.20  11.26  4.22  0.00  1 
derayf   72.?  68.7  20.71  11.39  2.93  0.00  1 
eamblg   69.?  65.6  21.78  10.56  1.94  0.00  2 
aglesz   63.9  78.1  17.03   7.56  2.00  0.00  2 
glfoer   ?     87.5  22.15  12.11  0.00  0.00  1 
pldgeu   72.2  62.5  20.01  10.00  2.09  0.00  2 
dhuson   52.?  53.1  21.48  11.37  2.91  0.00  1 
uhrled   ?3.9  65.6  22.68  11.85  4.58  0.00  2 
turisj   83.3  84.4  18.74  11.68  7.38  0.00  2 
htaler   36.1  81.2  22.50  12.83  5.04  0.00  1 
ntagem   61.1  71.9  21.38  11.10  3.88  0.00  1 
GKasetn  47.2  62.5  19.94   8.42  1.59  0.00  1 
yotrcu   55.6  71.9  18.94   3.71  0.00  0.00  2 
spninu   77.8  59.4  19.95  10.47  3.76  0.00  2 
grvina   69.?  75.0  20.72  11.83  4.49  0.00  2 
ariedp   50.0  62.5  20.10  10.28  3.48  0.00  2 
prcacs   88.9  96.9  23.06  11.49  0.48  0.00  2 
fuinls   75.0  68.7  22.58  12.09  2.89  0.00  1 
sz.tld   69.4  65.6  23.60  12.07  0.00  0.00  2 
pscied   58.3  68.7  22.80  12.40  2.20  0.00  1 
unatls   66.7  81.2  22.20  13.84  3.75  0.00  2 
itoprd   66.7  78.1  21.89  10.73  1.38  0.00  2 
itrcop   75.0  81.2  19.65  11.16  0.00  0.00  2 
mitrsu   66.7  65.6  20.20  10.37  0.00  0.00  2 
loncku   66.7  59.4  18.?8  10.39  3.10  0.00  2 
Irregular-Low 

ITEM     E3 
oidcng   52.8 
endcoi   61.1 
efcnor   58.3 
uslnoc   72.2 
oedicp   61.1 
adhtre   58.3 
sdelay   80.6 
tcepid   36.1 
dpesil   75.0 
veisrd   69.4 
gsanei   77.8 
sfaith   63.9 
fnieroa  66.7 
ofahtm   50.0 
faegro   75.0 
dgroef   55.6 
daefry   75.0 
maeblg   86.1 
gszela   61.1 
rlfoge   55.6 
elupdg   77.8 
huodns   66.7 
hlderu   44.4 
surjti   72.2 
hletar   83.3 
nagtme   80.6 
deaksm   75.0 
tcouyr   63.9 
pshinu   63.9 
ravngi   63.9 
rpeida   52.8 
earscp   69.4 
fnilua   72.2 
lesdta   58.3 
sipdce   41.7 
snltau   61.1 
diprto   75.0 
icrpot   66.7 
srtumi   80.6 
nlcoku   63.9 
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Appendix 4 



The regular, very-high anagrams (R-VH) matched 
with respect to summed log bigram frequency, and 
very irregular, very-low anagrams (VI-VL). The 
number of orthographic irregularities is indicated 
beside each item. 



High Word Frequency 

R-VH      VI-VL 
almots 1  amtslo 3 
aiourd 1  aoudnr 4 
hebind 0  dnbhei 3 
bondey 0  nbdyeo 3 
briged 0  bdeigr 2 
chager 0  rchgae 3 
mocing 0  ngcmio 3 
cotuny 0  ctnyou 3 
couser 0  srceou 3 
desing 0  ndgsei 3 
crited 0  rcdtei 3 
gurind 0  dgnriu 4 
fimaly 0  lmyifa 3 
frined 0  rfdnei 3 
durong 0  ngrduo 4 
navigh 0  gvnaih 3 
loreng 0  rglnoe 3 
tarkem 0  mtrkae 3 
morned 0  nrmdoe 3 
kaming 0  gmnkia 3 
tauner 0  nraetu 3 
burmen 0  rbmneu 4 
otsher 1  ohsetr 3 
poried 0  pdreio 3 
pornes 0  nrpsoe 3 
calped 0  dlaecp 3 
calpes 0  pcaesl 3 
posint 0  nstpoi 3 
cublip 0  bpiucl 3 
saried 0  sraedi 3 
surtle 0  tseulr 3 
hosuld 0  dlsuoh 4 
limpes 0  mlpsei 3 
kating 0  ktngai 3 
watcrd 0  dtaowr 3 
nurted 0  tduenr 3 
vomule 0  muoevl 4 
wakeld 0  lkdwea 3 
watend 0  wtaedn 3 
wroked 0  krwdoe 3 
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Low Word Frequency 



R-VH 
docing 0 
conied 0 
focren 0 
nucols 0 
poiced 0 
hardet 0 
saldey 0 
citpea 0 
sidple 0 
visder 0 
aniges 1 
fasith 0 
famien 0 
fomath 0 
foager 0 
geford 0 
fardey 0 
gembal 0 
galzes 0 
leforg 0 
peguld 0 
shudon 0 
hureld 0 
jurits 0 
thaler 0 
gemant 0 
smaked 0 
ocutry 1 
hupins 0 
varing 0 
peraid 0 
ceraps 0 
nifuls 0 
dastle 0 
piceds 0 
slanut 0 
pidrot 0 
pocirt 0 
tumsir 0 
uncolk 1 



VI-VL 
cgdnio 3 
dcoeni 3 
fnoecr 
nclsou 
dpoeci 
hrdtae 
syaedl 
tcpdei 
ldpsei 
rsiedv 
isgnae 
fthsai 
fnmaei 
taomfh 
gfaoer 
drgfoe 
frdyae 
lmbgea 
zslgae 
rfldoe 
dleupg 
sdhnou 
ldhreu 
tjrsiu 
lhaetr 
tmngae 
sdmkea 
uoytcr 
hpiusn 
rngvia 
rdpaei 
sraepc 
lniufs 
dtaesl 
pdcsei 
ltnsau 
tdrpio 
rcptio 
tmrsiu 
ukcnlo 



3 
3 
3 
3 
3 
3 
3 
3 
3 
3 
4 

3 
4 
2 
3 
3 
3 
3 
3 
3 
3 
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3 
3 
3 
3 
3 
4 
4 
3 
3 
3 
3 
3 
3 
3 
3 
4 



Appendix 5 



Regularity Instructions for Paired-Judgments 

Thank you for participating in our research. 

Our research is directed at discovering what it is 
about English words that allows us to look at strings of 
letters and judge how much they resemble words. In trying 
to understand the structure present in words, we have 
focused on an important component property: regularity of 
letter sequencing. For example, there are many consonant 
sequences which can begin words (e.g., wh, fr, dr, and fl) 
but cannot end words. Similarly, some regular consonant 
sequences at the end of words (ng, ld, ct, and ls) cannot 
begin words. There are regular vowel sequences (e.g., ea 
and ou) and irregular sequences (e.g., aa and ae). Finally, 
there is regularity in how these consonant and vowel 
groupings are themselves sequenced within words. 

You will be shown pairs of six-letter strings which 
have been constructed to vary along this dimension of letter 
sequence regularity. That is, some strings preserve normal 
letter sequencing while others violate normal sequencing to 
a lesser or greater degree. Some of the strings will be 
words, but most will be meaningless. Whether a string is a 
word is not directly relevant. 

The object of this experiment is to determine how well 
you can judge the regularity of letter sequencing in the 
six-letter strings. You are to evaluate both members of 
each pair and choose the more "regular" of the two, e.g., 
the string that is the most regular in terms of letter 
sequencing. Make your choice by hitting the button on the 
same side as the more "regular" member of the pair. 

The first 10 pairs should give you a feel for the task. 
Try to work reasonably quickly and at a fairly steady pace. 
You must make a choice for each pair, guessing if necessary. 

The experiment is divided into two sessions, with a 5 
minute break between them. The entire experiment should 
take a little over an hour. At the beginning of each 
session, you will see "START" on the screen. When this 
happens, be sure to press the "start bar" on the keyboard in 
front of you. When performing the tasks, you should sit 
comfortably with your two index fingers resting lightly on 
the two response buttons not covered by cardboard. Be sure 
not to apply pressure on the cardboard cover on the 
keyboard, as this can cause the keys to be accidently 
depressed. 
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Appendix 6 



Frequency Instructions for Paired-Judgments 

Thank you for participating in our research. 

Our research is directed at discovering what it is 
about English words that allows us to look at strings of 
letters and judge how much they resemble words. In trying 
to understand the structure present in words, we have 
focused on an important component property: the frequency of 
letter groups occurring at specific positions. Groups of 
letters occur with varying frequency at different positions 
within words. For example, you can probably think of words 
that end in ed and words that end in ls, but ed is about ten 
times more frequent than ls in final position. Likewise, 
st and d£ both occur at the beginning of words, but st is 
over 100 times as frequent in initial position. In the 
middle of words, the vowel group ai is almost 10 times as 
frequent as the vowel group ao. 

You will be shown pairs of six-letter strings which 
have been constructed to vary along the dimension of letter 
group frequency by position. That is, some strings have 
been created with letter groups that are very frequent in 
those positions in English words while other strings have 
letter groups that are very infrequent in those positions. 
Some of the strings will be words, but most will be 
meaningless. Whether a string is a word is not directly 
relevant. 

The object of this experiment is to determine how well 
you can judge the frequency of letter groups in the 
six-letter strings. You are to evaluate both members of each 
pair and choose the more "frequent" of the two, e.g., the 
string that has the most frequent letter groups at the 
appropriate positions. Make your choice by hitting the 
button on the same side as the more "frequent" member of the 
pair. 

The first 10 pairs should give you a feel for the task. 
Try to work reasonably quickly at a fairly steady pace. You 
must make a choice for each pair, guessing if necessary. 

The experiment is divided into two sessions, with a 5 
minute break between them. The entire experiment should 
take a little over an hour. At the beginning of each 
session, you will see "START" on the screen. When this 
happens, be sure to press the "start bar" on the keyboard in 
front of you. When performing the tasks, you should sit 
comfortably with your two index fingers resting lightly on 
the two response buttons not covered by cardboard. Be sure 
not to apply pressure on the cardboard cover on the 
keyboard, as this can cause the keys to be accidently 
depressed. 
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